X-Git-Url: http://git.shadowcat.co.uk/gitweb/gitweb.cgi?a=blobdiff_plain;f=Todo-5.6;h=8dcb9be8d1242238d1fdfb04b9ac33b8e6641d90;hb=76ced9add7b621dfc9d4ecb534aeea8e131a418a;hp=44f0bb2b9871128f366df4300cf05f3a05aa244a;hpb=1d82895fed2c6ca734f2f029d10ba9d0a6bb568a;p=p5sagit%2Fp5-mst-13.2.git

diff --git a/Todo-5.6 b/Todo-5.6
index 44f0bb2..8dcb9be 100644
--- a/Todo-5.6
+++ b/Todo-5.6
@@ -1,6 +1,3 @@
-Bugs
-    fix small memory leaks on compile-time failures
-
 Unicode support
     finish byte <-> utf8 and localencoding <-> utf8 conversions
     make substr($bytestr,0,0,$charstr) do the right conversion
@@ -13,10 +10,27 @@ Unicode support
         - a way to set default disciplines for all handle constructors:
             use open IN => ":any", OUT => ":utf8", SYS => ":utf16"
     eliminate need for "use utf8;"
-    autoload utf8_heavy.pl's swash routines in swash_init()
     autoload byte.pm when byte:: is seen by the parser
     check uv_to_utf8() calls for buffer overflow
-    (see also "Locales", "Regexen", and "Miscellaneous" )
+    make \uXXXX (and \u{XXXX}?) where XXXX are hex digits
+        to work similarly to Unicode tech reports and Java
+        notation \uXXXX (and already existing \x{XXXX))?
+        more than four hexdigits? make also \U+XXXX work?
+    overloadable regex assertions? e.g. in Thai \b cannot
+        be deduced by any simple character class boundary rules,
+        word boundaries must algorithmically computed
+
+    see ext/Encode/Todo for notes and references about proper detection
+    of malformed UTF-8
+
+    SCSU? http://www.unicode.org/unicode/reports/tr6/
+    Collation? http://www.unicode.org/unicode/reports/tr10/
+    Normalization? http://www.unicode.org/unicode/reports/tr15/
+    EBCDIC? http://www.unicode.org/unicode/reports/tr16/
+    Regexes? http://www.unicode.org/unicode/reports/tr18/
+    Case Mappings? http://www.unicode.org/unicode/reports/tr21/
+
+    See also "Locales", "Regexen", and "Miscellaneous".
 
 Multi-threading support
     "use Thread;" under useithreads
@@ -44,24 +58,63 @@ Namespace cleanup
 
 Configure
     make configuring+building away from source directory work (VPATH et al)
-    _r support
-    cross-compilation configuring
+        this is related to: cross-compilation configuring (see Todo)
+    _r support (see Todo for mode detailed description)
     POSIX 1003.1 1996 Edition support--realtime stuff:
         POSIX semaphores, message queues, shared memory, realtime clocks,
         timers, signals (the metaconfig units mostly already exist for these)
+        PREFERABLY AS AN EXTENSION
     UNIX98 support: reader-writer locks, realtime/asynchronous IO
+        PREFERABLY AS AN EXTENSION
+    IPv6 support: see RFC2292, RFC2553
+        PREFERABLY AS AN EXTENSION
+        there already is Socket6 in CPAN
+
+Long doubles
+    figure out where the PV->NV->PV conversion gets it wrong at least
+    in AIX and Tru64 (V5.0 and onwards) when using long doubles: see the
+    regexp tricks we had to insert to t/comp/use.t and t/lib/bigfltpm.t,
+    (?:9|8999\d+) and the like.
+
+64-bit support
+    Configure probe for quad_t, uquad_t, and (argh) u_quad_t, they might
+    be in some systems the only thing working as quadtype and uquadtype.
+    more pain: long_long, u_long_long.
 
 Locales
     deprecate traditional/legacy locales?
+    How do locales work across packages?
     figure out how to support Unicode locales
         suggestion: integrate the IBM Classes for Unicode (ICU)
-        http://www10.software.ibm.com/developerworks/opensource/icu/index.html
-        and check out also the Locale Converter:
+        http://oss.software.ibm.com/developerworks/opensource/icu/project/
+        ICU is "portable, open-source Unicode library with:
+        charset-independent locales (with multiple locales
+        simultaneously supported in same thread; character
+        conversions; formatting/parsing for numbers, currencies,
+        date/time and messages; message catalogs (resources);
+        transliteration, collation, normalization, and text
+        boundaries (grapheme, word, line-break))".
+        Check out also the Locale Converter:
         http://alphaworks.ibm.com/tech/localeconverter
-    locales across packages?
+    There is also the iconv interface, either from XPG4 or GNU (glibc).
+        iconv is about character set conversions.
+    Either ICU or iconv would be valuable to get integrated
+        into Perl, Configure already probes for libiconv and .
 
 Regexen
     make RE engine thread-safe
+    a way to do full character set arithmetics: now one can do
+        addition, negate a whole class, and negate certain subclasses
+        (e.g. \D, [:^digit:]), but a more generic way to add/subtract/
+        intersect characters/classes, like described in the Unicode technical
+        report on Regular Expression Guidelines,
+        http://www.unicode.org/unicode/reports/tr18/
+        (amusingly, the TR notes that difference and intersection
+        can be done using "Perl-style look-ahead")
+        difference syntax? maybe [[:alpha:][^abc]] meaning
+        "all alphabetic expect a, b, and c"? or [[:alpha:]-[abc]]?
+        (maybe bad, as we explicitly disallow such 'ranges')
+        intersection syntax? maybe [[..]&[...]]?
     POSIX [=bar=] and [.zap.] would nice too but there's no API for them
         =bar= could be done with Unicode, though, see the Unicode TR #15
         about normalization forms:
@@ -69,11 +122,16 @@ Regexen
         this is also a part of the Unicode 3.0:
         http://www.unicode.org/unicode/uni2book/u2.html
         executive summary: there are several different levels of 'equivalence'
+    trie optimization: factor out common suffixes (and prefixes?)
+        from |-alternating groups (both for exact strings and character
+        classes, use lookaheads?)
     approximate matching
 
 Security
     use fchown, fchmod (and futimes?) internally when possible
     use fchdir(how portable?)
+    create secure reliable portable temporary file modules
+    audit the standard utilities for security problems and fix them
 
 Reliable Signals
     custom opcodes
@@ -86,12 +144,31 @@ Win32 stuff
     work out DLL versioning
 
 Miscellaneous
+    introduce @( and @) because group names can have spaces
     add new modules (Archive::Tar, Compress::Zlib, CPAN::FTP?)
-    sub-second sleep? alarm? time? (integrate Time::HiRes?)
-    floating point handling: nans, infinities, fp exception masks, etc
+    sub-second sleep()? alarm()? time()? (integrate Time::HiRes?
+        Configure doesn't yet probe for usleep/nanosleep/ualarm but
+        the units exist)
+    floating point handling: nans, infinities, fp exception masks, etc.
+        At least the following interfaces exist: fp_classify(), fp_class(),
+        class(), isinf(), isfinite(), finite(), isnormal(), unordered(),
+        , (there are metaconfig units for all these),
+        fp_setmask(), fp_getmask(), fp_setround(), fp_getround()
+        (no metaconfig units yet for these).
+        Don't forget finitel(), fp_classl(), fp_class_l(), (yes, both do,
+        unfortunately, exist), and unorderedl().
+        PREFERABLY AS AN EXTENSION.
+        As of 5.6.1 there is cpp macro Perl_isnan().
+    fix the basic arithmetics (+ - * / %) to preserve IVness/UVness if
+        both arguments are IVs/UVs: it sucks that one cannot see
+        the 'carry flag' (or equivalent) of the CPU from C,
+        C is too high-level...
     replace pod2html with new PodtoHtml? (requires other modules from CPAN)
     automate testing with large parts of CPAN
-    Unicode collation?
+    turn Cwd into an XS module? (Configure already probes for getcwd())
+    mmap for speeding up input? (Configure already probes for the mmap family)
+    sendmsg, recvmsg? (Configure doesn't probe for these but the units exist)
+    setitimer, getitimer? (the metaconfig units exist)
 
 Ongoing
     keep filenames 8.3 friendly, where feasible
@@ -106,3 +183,5 @@ Documentation
     spot-check all new modules for completeness
     better docs for pack()/unpack()
     reorg tutorials vs. reference sections
+    make roffitall to be dynamical about its pods and libs
+