========================================================================
* packaging/README
========================================================================
Copyright (C) 2016 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
Copyright (C) 2000-2003, International Business Machines Corporation and others.
All Rights Reserved.

This directory contains information, input files and scripts for
packaging ICU using specific packaging tools. We assume that the
packager is familiar with the tools and procedures needed to build a
package for a given packaging method (for example, how to use
dpkg-buildpackage(1) on Debian GNU/Linux, or rpm(8) on distributions
that use RPM packages).

Please read the file PACKAGES if you are interested in packaging ICU
yourself. It describes what the different packages should be, and what
their contents are.

========================================================================
* source/extra/uconv/README
========================================================================
Copyright (C) 2016 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
Copyright (c) 2002, International Business Machines Corporation and others.
All Rights Reserved.

The uconv command is an iconv(1)-like conversion / transcoding program.
Please check its manual page, or run uconv -h, for help.

Help and error messages are displayed through the use of a resource
bundle. Please contact Steven Loomis if you want to offer a translation
of these messages for a particular locale.

uconv was originally written and contributed to icuapps by Jonas
Utterström, and offered simple conversion and a way to know which
encodings were available. It has since been moved to the main ICU
distribution and converted to the C conversion API, and is maintained
by Yves Arrouye, who seems to always be looking for one more feature or
option to add to the tool.
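As an illustration of the iconv(1)-like usage described above, here is a minimal sketch that converts standard input to UTF-8. It assumes uconv is installed and on the PATH and relies only on its -f (from) and -t (to) options; the helper name to_utf8 is hypothetical and not part of ICU.

```shell
# Convert standard input from the given encoding to UTF-8 using uconv.
# Returns 127 with a message when uconv is not installed.
to_utf8() {
    if ! command -v uconv >/dev/null 2>&1; then
        echo "uconv not found on PATH" >&2
        return 127
    fi
    uconv -f "$1" -t utf-8
}

# Example input: byte 0xE9 (octal 351) is e-acute in ISO-8859-1.
printf 'caf\351\n' | to_utf8 iso-8859-1 || true
```

Running uconv -l lists the encodings available in the installed ICU build.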
======================================================================== * source/samples/break/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2002-2010, International Business Machines Corporation and others. All Rights Reserved. break: Boundary Analysis This sample demonstrates Using ICU to determine the linguistic boundaries within text Files: break.cpp Main source file in C++ ubreak.c Main source file in C break.sln Windows MSVC workspace. Double-click this to get started. break.vcproj Windows MSVC project file To Build break on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\break\break.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the break directory, e.g. cd c:\icu\source\samples\break\debug 4. Run it (Warning: Be careful, 'break' is also a system command on many systems) .\break To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Compile cd /source/samples/break gmake ICU_PREFIX=/source/samples/break gmake ICU_PREFIX= check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH break Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. 
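The note above says that the loader variable has a different name on some systems. A minimal sketch of picking the name from the output of uname follows. The names for macOS, AIX and HP-UX are the commonly documented ones and are an assumption here, since this readme only confirms Linux and Solaris; the helper name libpath_var_for and the default prefix are hypothetical. "gmake check" remains the authoritative source for a given platform.

```shell
# Map an operating system name (as printed by `uname -s`) to the
# environment variable its dynamic loader searches for libraries.
libpath_var_for() {
    case "$1" in
        Linux|SunOS) echo LD_LIBRARY_PATH ;;    # confirmed by the readme
        Darwin)      echo DYLD_LIBRARY_PATH ;;  # assumption: macOS
        AIX)         echo LIBPATH ;;            # assumption: AIX
        HP-UX)       echo SHLIB_PATH ;;         # assumption: 32-bit HP-UX
        *)           echo LD_LIBRARY_PATH ;;    # assumption: common default
    esac
}

# Hypothetical install prefix; use the directory passed to --prefix.
ICU_PREFIX="${ICU_PREFIX:-/usr/local}"
var=$(libpath_var_for "$(uname -s)")

# Print the export line instead of running it, so it can be reviewed first.
echo "export $var=$ICU_PREFIX/lib:.:\$$var"
```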
========================================================================
* source/samples/cal/readme.txt
========================================================================
Copyright (C) 2016 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html#License
Copyright (c) 2002-2005, International Business Machines Corporation and others.
All Rights Reserved.

icucal: a sample program which displays the calendar.

This sample demonstrates
    Formatting a calendar
    Outputting text in the default codepage to the console

Files:
    cal.c       Main source file
    uprint.h    codepage output convenience header
    uprint.c    codepage output convenience implementation
    cal.sln     Windows MSVC workspace. Double-click this to get started.
    cal.vcproj  Windows MSVC project file

To Build icucal on Windows
    1. Install and build ICU
    2. In MSVC, open the workspace file icu\samples\cal\cal.sln
    3. Choose a Debug or Release build.
    4. Build.

To Run on Windows
    1. Start a command shell window
    2. Add ICU's bin directory to the path, e.g.
          set PATH=c:\icu\bin;%PATH%
       (Use the path to wherever ICU is on your system.)
    3. cd into the cal directory, e.g.
          cd c:\icu\source\samples\cal\debug
    4. Run it
          cal

To Build on Unixes
    1. Build ICU. icucal is built automatically by default unless
       samples are turned off.
       Specify an ICU install directory when running configure,
       using the --prefix option. The steps to build ICU will look
       something like this:
          cd /source
          runConfigureICU --prefix [other options]
          gmake all
    2. Install ICU,
          gmake install

To Run on Unixes
    cd /source/samples/cal
    gmake check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    cal

Note: The name of the LD_LIBRARY_PATH variable is different on some
systems. If in doubt, run the sample using "gmake check", and note the
name of the variable that is used there. LD_LIBRARY_PATH is the correct
name for Linux and Solaris.
======================================================================== * source/samples/case/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2003-2005, International Business Machines Corporation and others. All Rights Reserved. case: case mapping This sample demonstrates Using ICU to convert between different cases Files: case.cpp Main source file in C++ ucase.c Main source file in C case.sln Windows MSVC workspace. Double-click this to get started. case.vcproj Windows MSVC project file To Build case on Windows 1. Install and build ICU 2. In MSVC, open the solution file icu\samples\case\case.sln (or, use the workspace All, in icu\samples\all\all.sln ) 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the case directory, e.g. cd c:\icu\source\samples\case\debug 4. Run it case To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Compile cd /source/samples/case gmake ICU_PREFIX=/source/samples/case gmake ICU_PREFIX= check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH case Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. 
======================================================================== * source/samples/citer/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2003-2010, International Business Machines Corporation and others. All Rights Reserved. citer: Character Iteration This sample demonstrates Demonstrating ICU's CharacterIterator Files: citer.cpp Main source file in C++ citer.sln Windows MSVC workspace. Double-click this to get started. citer.vcproj Windows MSVC project file To Build citer on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\citer\citer.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the citer directory, e.g. cd c:\icu\source\samples\citer\debug (note that it may be in a different relative directory than most of the other samples). 4. Run it citer To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Compile cd /source/samples/citer gmake ICU_PREFIX=/source/samples/citer gmake ICU_PREFIX= check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH citer Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. 
========================================================================
* source/samples/coll/readme.txt
========================================================================
Copyright (C) 2016 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html#License
Copyright (c) 2002-2005, International Business Machines Corporation and others.
All Rights Reserved.

coll: a sample program which compares 2 strings with a user-defined collator.

This sample demonstrates
    Creating a user-defined collator
    Comparing 2 strings using the collator created

Files:
    coll.c       Main source file
    coll.sln     Windows MSVC workspace. Double-click this to get started.
    coll.vcproj  Windows MSVC project file

To Build coll on Windows
    1. Install and build ICU
    2. In MSVC, open the workspace file icu\samples\coll\coll.sln
    3. Choose a Debug or Release build.
    4. Build.

To Run on Windows
    1. Start a command shell window
    2. Add ICU's bin directory to the path, e.g.
          set PATH=c:\icu\bin;%PATH%
       (Use the path to wherever ICU is on your system.)
    3. cd into the coll directory, e.g.
          cd c:\icu\source\samples\coll\debug
    4. Run it
          coll [options*] -source source_string -target target_string

To Build on Unixes
    1. Build ICU. coll is built automatically by default unless
       samples are turned off.
       Specify an ICU install directory when running configure,
       using the --prefix option. The steps to build ICU will look
       something like this:
          cd /source
          runConfigureICU --prefix [other options]
          gmake all
    2. Install ICU,
          gmake install

To Run on Unixes
    cd /source/samples/coll
    gmake check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    coll [options*] -source source_string -target target_string

Note: The name of the LD_LIBRARY_PATH variable is different on some
systems. If in doubt, run the sample using "gmake check", and note the
name of the variable that is used there. LD_LIBRARY_PATH is the correct
name for Linux and Solaris.
========================================================================
* source/samples/csdet/readme.txt
========================================================================
Copyright (C) 2016 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html#License
Copyright (c) 2001-2010 International Business Machines Corporation and others.
All Rights Reserved.

csdet: Charset Detection

This sample demonstrates
    Using ICU's CharSet Detection API

Files:
    csdet.c     Main source file
    *.txt       Various sample .txt files

To Build csdet on Windows
    1. Install and build ICU
    2. In MSVC, open the workspace file icu\samples\csdet\csdet.sln
    3. Choose a Debug or Release build.
    4. Build.

To Run on Windows
    1. Start a command shell window
    2. Add ICU's bin directory to the path, e.g.
          set PATH=c:\icu\bin;%PATH%
       (Use the path to wherever ICU is on your system.)
    3. cd into the csdet directory, e.g.
          cd c:\icu\source\samples\csdet\debug
    4. Run it (with the name of a text file, e.g. eucJP.txt)
          csdet eucJP.txt

WARNING: The .txt files must be in the same directory as the
executable, which is not the case by default on some systems.

To Build on Unixes
    1. Build ICU.
       Specify an ICU install directory when running configure,
       using the --prefix option. The steps to build ICU will look
       something like this:
          cd /source
          runConfigureICU --prefix [other options]
          gmake all
    2. Install ICU,
          gmake install
    3. Compile
          cd /source/samples/csdet
          gmake ICU_PREFIX=

To Run on Unixes
    cd /source/samples/csdet
    gmake ICU_PREFIX= check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    csdet eucJP.txt

Note: The name of the LD_LIBRARY_PATH variable is different on some
systems. If in doubt, run the sample using "gmake check", and note the
name of the variable that is used there. LD_LIBRARY_PATH is the correct
name for Linux and Solaris.
========================================================================
* source/samples/date/readme.txt
========================================================================
Copyright (C) 2016 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html#License
Copyright (c) 2002-2010, International Business Machines Corporation and others.
All Rights Reserved.

icudate: a sample program which displays the current date

This sample demonstrates
    Formatting a date
    Outputting text in the default codepage to the console

Files:
    date.c       Main source file
    uprint.h     codepage output convenience header
    uprint.c     codepage output convenience implementation
    date.sln     Windows MSVC workspace. Double-click this to get started.
    date.vcproj  Windows MSVC project file

To Build icudate on Windows
    1. Install and build ICU
    2. In MSVC, open the workspace file icu\samples\date\date.sln
    3. Choose a Debug or Release build.
    4. Build.

To Run on Windows
    1. Start a command shell window
    2. Add ICU's bin directory to the path, e.g.
          set PATH=c:\icu\bin;%PATH%
       (Use the path to wherever ICU is on your system.)
    3. cd into the icudate directory, e.g.
          cd c:\icu\source\samples\date\debug
    4. Run it (Warning: Be careful, 'date' is also a system command on
       many systems)
          .\date

To Build on Unixes
    1. Build ICU. icudate is built automatically by default unless
       samples are turned off.
       Specify an ICU install directory when running configure,
       using the --prefix option. The steps to build ICU will look
       something like this:
          cd /source
          runConfigureICU --prefix [other options]
          gmake all
    2. Install ICU,
          gmake install

To Run on Unixes
    cd /source/samples/date
    gmake check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    date

Note: The name of the LD_LIBRARY_PATH variable is different on some
systems. If in doubt, run the sample using "gmake check", and note the
name of the variable that is used there. LD_LIBRARY_PATH is the correct
name for Linux and Solaris.
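The warning about 'date' (and the similar one about 'break' in the break sample) also matters in Unix shells, where a bare name is resolved against shell builtins and the PATH rather than the current directory. A small sketch that reports what a bare name would run, assuming a POSIX shell; the helper name describe_command is hypothetical.

```shell
# Report how the shell would resolve a bare command name.
describe_command() {
    if out=$(command -V "$1" 2>/dev/null); then
        echo "$out"
    else
        echo "$1: not found"
    fi
}

describe_command break   # a builtin in POSIX shells, not the sample
describe_command date    # normally the system date utility
```

Running the sample with an explicit path from its build directory (for example ./date) bypasses this lookup.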
======================================================================== * source/samples/datefmt/README.TXT ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2002-2010, International Business Machines Corporation and others. All Rights Reserved. IMPORTANT: This sample was originally intended as an exercise for the ICU Workshop (September 2000). The code currently provided in the solution file is the answer to the exercises, each step can still be found in the 'answers' subdirectory. ** Workshop homepage is: http://www.icu-project.org/docs/workshop_2000/agenda.html #Date/Time/Number Formatting Support 9:30am - 10:30am Alan Liu Topics: 1. What is the date/time support in ICU? 2. What is the timezone support in ICU? 3. What kind of formatting and parsing support is available in ICU, i.e. NumberFormat, DateFormat, MessageFormat? INSTRUCTIONS ------------ This exercise was first developed and tested on ICU release 1.6.0, Win32, Microsoft Visual C++ 6.0. It should work on other ICU releases and other platforms as well. MSVC: Open the file "datefmt.sln" in Microsoft Visual C++. Unix: - Build and install ICU with a prefix, for example '--prefix=/home/srl/ICU' - Set the variable ICU_PREFIX=/home/srl/ICU and use GNU make in this directory. - You may use 'make check' to invoke this sample. PROBLEMS -------- Problem 0: Set up the program, build it, and run it. To start with, the program prints out a list of languages. Problem 1: Basic Date Formatting (Easy) Create a calendar, and use it to get the UDate for June 4, 1999, 0:00 GMT (or any date of your choosing). You will have to create a TimeZone (use the createZone() function already defined in main.cpp) and a Calendar object, and make the calendar use the time zone. 
Once you have the UDate, create a DateFormat object in each of the
languages in the LANGUAGE array, and display the date in that language.
Use the DateFormat::createDateInstance() method to create the date
formatter.

Problem 2: Date Formatting, Specific Time Zone (Medium)

To really localize a time display, one can also specify the time zone
in which the time should be displayed. For each language, also create
different time zones from the TIMEZONE list.

To format a date with a specific calendar and zone, you must deal with
three objects: a DateFormat, a Calendar, and a TimeZone. Each object
must be linked to another in correct sequence: The Calendar must use
the TimeZone, and the DateFormat must use the Calendar.

    DateFormat  =uses=>  Calendar  =uses=>  TimeZone

Use either the setFoo() methods (which copy their argument) or the
adoptFoo() methods (which take ownership of it), depending on where you
want to have ownership.

NOTE: It's not always desirable to change the time to a local time zone
before display. For instance, if some event occurs at 0:00 GMT on the
first of the month, it's probably clearer to just state that. Stating
that it occurs at 5:00 PM PDT on the day before in the summer, and
4:00 PM PST on the day before in the winter will just confuse the
issue.

NOTES
-----
To see a list of system TimeZone IDs, use the
TimeZone::createAvailableIDs() methods. Alternatively, look at the file
icu/docs/tz.htm. This has a hyperlinked list of current system zones.

ANSWERS
-------
The exercise includes answers. These are in the "answers" directory,
and are numbered 1, 2, etc. If you get stuck and you want to move to
the next step, copy the answers file into the main directory in order
to proceed. E.g., "main_1.cpp" contains the original "main.cpp" file.
"main_2.cpp" contains the "main.cpp" file after problem 1. Etc.

Have fun!

========================================================================
* source/samples/legacy/README
========================================================================
Copyright (C) 2016 and later: Unicode, Inc.
and others.
License & terms of use: http://www.unicode.org/copyright.html#License
Copyright (c) 2002, International Business Machines Corporation and others.
All Rights Reserved.

This example demonstrates running an instance of ICU 1.8.1 together
with a current version of ICU. It only tests u_getVersion and several
collation APIs. Generally, one should be able to simultaneously use one
or more versions of ICU 2.0 or higher and one version of ICU 1.8.1 or
lower.

What is it all about:
Let's say you have a 10 TB database indexed using ICU 1.8.1 sortkeys.
A new ICU comes out, with neat new features you would like to use, but
also with new sortkeys, and you don't care to reindex your 10 TB
database. What to do then? You can use ICU 1.8.1 in one of your
compilation units and the current version in all the others. So, you
can use the old collation until you decide to reindex.

You cannot mix two versions of ICU in the same compilation unit.
You cannot automatically use more than one legacy version of ICU.

In order to make the compilation unit use the old version of ICU, you
have to do a couple of things:
1) change its include path so that it includes header files from the
   old version
2) explicitly add the old libraries to the linker
3) make sure old data can be found (if legacy code needs data)

Building and running of the example:

Linux:
1. To make it work, you should build and install both the current ICU
   and ICU 1.8.1. Put both data libraries wherever ICU_DATA points
   (usually it is $(prefix)/share/icu/$(icu_version)/). If data
   libraries are used, then check for $(prefix)/lib/icu/1.8.1, which
   should contain libicudata.so and libicudt18*.so.
2. Copy libicuuc.so.18* and libicui18n.so.18* to the $(prefix)/lib
   directory (together with the current libraries).
3. Should work on other Unixes.

Change $ICU_PREFIX to point to the current installation, and
$ICU_LEGACY to point to the 1.8.1 installation.
$ICU_LEGACY is needed solely to access the 1.8.1 include directory
through the $LEGACY_INCLUDE variable, so if you want to move the 1.8.1
include directory, you can set $LEGACY_INCLUDE directly to that
directory. Run make check. You should get two different libraries
running at the same time.

Win32:
Build both current ICU and ICU 1.8.1. Take icuuc18.dll, icuin18.dll and
icudt18l.dll and put them somewhere in PATH (a sane place would be
wherever current dlls go). Edit the include directory for oldcol.cpp so
that it points to the include directory of ICU 1.8.1. Edit the two
library entries with path so that they point to .lib files for your
version of ICU. Hit F7, followed by ctrl-F5.

Troubleshooting (all platforms):

Sample won't compile: this is quite unlikely, but the most probable
reason is that include files cannot be found.

Sample won't link: The path for the 1.8.1 libraries is broken. Edit it
so that it reflects the path to your libraries.

Linker says: "Undefined symbol u_getVersion()" (or something similar):
the path to the 1.8.1 libraries is bad.

Linker says: "Undefined symbol u_getVersion()_X_Y" (or something
similar): the path to the current libraries is bad.

Legacy crashes horribly: Sorry, the sample does no error checking. If
legacy crashes, that's most probably because it cannot find the data
libraries. You can see which data library is not found by the part of
the program that is running. Make sure the program can find the data
library either by putting it wherever ICU_DATA points to OR by putting
the DLL version of the data library somewhere on your PATH.

========================================================================
* source/samples/msgfmt/README.TXT
========================================================================
Copyright (C) 2016 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html#License
Copyright (c) 2002-2010, International Business Machines Corporation and others.
All Rights Reserved.

IMPORTANT:

This sample was originally intended as an exercise for the ICU Workshop
(September 2000). The code currently provided in the solution file is
the answer to the exercises; each step can still be found in the
'answers' subdirectory.

http://www.icu-project.org/docs/workshop_2000/agenda.html

Day 2: September 12th 2000

Pre-requisites:
1. All the hardware and software requirements from Day 1.
2. Attended or fully understand Day 1 material.
3. Read through the ICU user's guide at
   http://www.icu-project.org/userguide/.

#Date/Time/Number Formatting Support
9:30am - 10:30am
Alan Liu

Topics:
1. What is the date/time support in ICU?
2. What is the timezone support in ICU?
3. What kind of formatting and parsing support is available in ICU,
   i.e. NumberFormat, DateFormat, MessageFormat?

INSTRUCTIONS
------------
This exercise was first developed and tested on ICU release 1.6.0,
Win32, Microsoft Visual C++ 6.0. It should work on other ICU releases
and other platforms as well.

MSVC: Open the file "msgfmt.sln" in Microsoft Visual C++.

Unix: - Build and install ICU with a prefix, for example
        '--prefix=/home/srl/ICU'
      - Set the variable ICU_PREFIX=/home/srl/ICU and use GNU make in
        this directory.
      - You may use 'make check' to invoke this sample.

PROBLEMS
--------
Problem 0:

Set up the program, build it, and run it. To start with, the program
prints out the word "Message".

Problem 1: Basic Message Formatting (Easy)

Use a MessageFormat to create a message that prints out
"Received <n> argument(s) on <d>.", where n is the number of command
line arguments (use argc-1), and d is the date (use
Calendar::getNow()).

HINT: Your message pattern should have a "number" element and a "date"
element, and you will need to use Formattable.

Problem 2: ChoiceFormat (Medium)

We can do better than "argument(s)". Instead, we can display more
idiomatic strings, such as "no arguments", "one argument", "two
arguments", and for higher values, we can use a number format.
This kind of value-based switching is done using a ChoiceFormat.
However, you seldom need to create a ChoiceFormat by itself. Instead,
most of the time you will supply the ChoiceFormat pattern within a
MessageFormat pattern.

Use a ChoiceFormat pattern within the MessageFormat pattern, instead of
the "number" element, to display more idiomatic strings.

EXTRA: Embed a number element within the choice element to handle
values greater than two.

ANSWERS
-------
The exercise includes answers. These are in the "answers" directory,
and are numbered 1, 2, etc. If you get stuck and you want to move to
the next step, copy the answers file into the main directory in order
to proceed. E.g., "main_1.cpp" contains the original "main.cpp" file.
"main_2.cpp" contains the "main.cpp" file after problem 1. Etc.

Have fun!

========================================================================
* source/samples/numfmt/readme.txt
========================================================================
Copyright (C) 2016 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html#License
Copyright (c) 2002-2005, International Business Machines Corporation and others.
All Rights Reserved.

numfmt: a sample program which displays number formatting in C and C++

This sample demonstrates
    Formatting a number
    Outputting text in the default codepage to the console

Files:
    main.cpp       Main source file in C++
    capi.c         C version
    util.cpp       formatted output convenience implementation
    util.h         formatted output convenience header
    numfmt.sln     Windows MSVC workspace. Double-click this to get started.
    numfmt.vcproj  Windows MSVC project file

To Build on Windows
    1. Install and build ICU
    2. In MSVC, open the workspace file icu\samples\numfmt\numfmt.sln
    3. Choose a Debug or Release build.
    4. Build.

To Run on Windows
    1. Start a command shell window
    2. Add ICU's bin directory to the path, e.g.
          set PATH=c:\icu\bin;%PATH%
       (Use the path to wherever ICU is on your system.)
    3.
cd into the numfmt directory, e.g. cd c:\icu\source\samples\numfmt\debug 4. Run it numfmt To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Compile cd /source/samples/numfmt gmake ICU_PREFIX=/source/samples/numfmt gmake ICU_PREFIX= check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH numfmt Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * source/samples/props/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2002-2005, International Business Machines Corporation and others. All Rights Reserved. props: Unicode Character Properties This sample demonstrates Using ICU to determine the properties of Unicode characters Files: props.cpp Main source file in C++ props.sln Windows MSVC workspace. Double-click this to get started. props.vcproj Windows MSVC project file To Build props on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\props\props.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the props directory, e.g. cd c:\icu\source\samples\props\debug 4. Run it props To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. 
The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Compile cd /source/samples/props gmake ICU_PREFIX=/source/samples/props gmake ICU_PREFIX= check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH props Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * source/samples/readme.txt ======================================================================== ## Copyright (C) 2016 and later: Unicode, Inc. and others. ## License & terms of use: http://www.unicode.org/copyright.html#License ## ## Copyright (c) 2002-2010, International Business Machines Corporation ## and others. All Rights Reserved. This directory contains sample code Below is a short description of the contents of this directory. break - demonstrates how to use BreakIterators in C and C++. cal - prints out a calendar. case - demonstrates how to do Unicode case conversion in C and C++. csdet - demonstrates using ICU's CharSet Detection API date - prints out the current date, localized. datefmt - an exercise using the date formatting API layout - demonstrates the ICU LayoutEngine legacy - demonstrates using two versions of ICU in one application msgfmt - demonstrates the use of the Message Format numfmt - demonstrates the use of the number format props - demonstrates the use of Unicode properties strsrch - demonstrates how to search for patterns in Unicode text using the usearch interface. translit - demonstrates the use of ICU transliteration uciter8.c - demonstrates how to leniently read 8-bit Unicode text. 
ucnv - demonstrates the use of ICU codepage conversion udata - demonstrates the use of ICU low level data routines (reader/writer in 'all' MSVC solution) ufortune - demonstrates packaging and use of resources in an application ugrep - demonstrates ICU Regular Expressions. uresb - demonstrates building and loading resource bundles ustring - demonstrates ICU string manipulation functions == * Where can I find more sample code? - The "uconv" utility is a full-featured command line application. It is normally built with ICU, and is located in icu/source/extra/uconv - The "icuapps" CVS module contains other applications and libraries not included with ICU. You can check it out from the CVS command line by using for example, "cvs co icuapps" instead of "cvs co icu", or through WebCVS at http://dev.icu-project.org/cgi-bin/viewcvs.cgi/icuapps/ == * How do I build the samples? - See the Readme in each subdirectory To build all samples at once: Windows MSVC: - build ICU - open 'all' project file in 'all' subdirectory - build project - sample executables will be located in /x86/Debug folders of each sample subdirectory Unix: - build and install (make install) ICU - be sure 'icu-config' is accessible from the PATH - type 'make all-samples' from this directory (other targets: clean-samples, check-samples) Note: 'make all-samples' won't work correctly in out of source builds. - legacy and layout are not included in these lists, please see their individual readmes. ======================================================================== * source/samples/strsrch/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2002-2005, International Business Machines Corporation and others. All Rights Reserved. 
strsrch: a sample program which finds the occurrences of a pattern string in a source string, using user-defined collation rules. This sample demonstrates Creating a user-defined string search mechanism. Finding all occurrences of a pattern string in a given source string. Files: strsrch.c Main source file strsrch.sln Windows MSVC workspace. Double-click this to get started. strsrch.vcproj Windows MSVC project file To Build strsrch on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\strsrch\strsrch.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to wherever ICU is on your system.) 3. cd into the strsrch directory, e.g. cd c:\icu\source\samples\strsrch\debug 4. Run it strsrch [options*] -source source_string -pattern pattern_string To Build on Unixes 1. Build ICU. strsrch is built automatically by default unless samples are turned off. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install To Run on Unixes cd /source/samples/strsrch gmake check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH strsrch Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * source/samples/translit/README.TXT ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2002-2010, International Business Machines Corporation and others.
All Rights Reserved. IMPORTANT: This sample was originally intended as an exercise for the ICU Workshop (September 2000). The code currently provided in the solution file is the answer to the exercises, each step can still be found in the 'answers' subdirectory. http://www.icu-project.org/docs/workshop_2000/agenda.html Day 2: September 12th 2000 Pre-requisite: 1. All the hardware and software requirements from Day 1. 2. Attended or fully understand Day 1 material. 3. Read through the ICU user's guide at http://www.icu-project.org/userguide/. #Transformation Support 10:45am - 12:00pm Alan Liu Topics: 1. What is the Unicode normalization? 2. What kind of case mapping support is available in ICU? 3. What is Transliteration and how do I use a Transliterator on a document? 4. How do I add my own Transliterator? INSTRUCTIONS ------------ This exercise was developed and tested on ICU release 1.6.0, Win32, Microsoft Visual C++ 6.0. It should work on other ICU releases and other platforms as well. MSVC: Open the file "translit.sln" in Microsoft Visual C++. Unix: - Build and install ICU with a prefix, for example '--prefix=/home/srl/ICU' - Set the variable ICU_PREFIX=/home/srl/ICU and use GNU make in this directory. - You may use 'make check' to invoke this sample. PROBLEMS -------- Problem 0: To start with, the program prints out a series of dates formatted in Greek. Set up the program, build it, and run it. Problem 1: Basic Transliterator (Easy) The Greek text shows up almost entirely as Unicode escapes. These are unreadable on a US machine. Use an existing system transliterator to transliterate the Greek text to Latin so it can be phonetically read on a US machine. If you don't know the names of the system transliterators, use Transliterator::getAvailableID() and Transliterator::countAvailableIDs(), or look directly in the index table icu/data/translit_index.txt. 
Problem 2: RuleBasedTransliterator (Medium) Some of the text is still unreadable and shows up as Unicode escape sequences. Create a RuleBasedTransliterator to change the unreadable characters to close ASCII equivalents. For example, the rule "\u00C0 > A;" will change an 'A' with a grave accent to a plain 'A'. To save typing, use UnicodeSets to handle ranges of characters. See the included file "U0080.pdf" for a table of the U+00C0 to U+00FF Unicode block. Problem 3: Transliterator subclassing; Normalizer (Difficult) The rule-based approach is flexible and, in most cases, the best choice for creating a new transliterator. Sometimes, however, a more elegant algorithmic solution is available. Instead of typing in a list of rules, you can write C++ code to accomplish the desired transliteration. Use a Normalizer to remove accents from characters. You will need to convert each character to a sequence of base and combining characters by applying a canonical decomposition (NFD). Then discard the combining characters (the accents etc.), leaving the base character. Wrap this all up in a subclass of the Transliterator class that overrides the pure virtual handleTransliterate() method. ANSWERS ------- The exercise includes answers. These are in the "answers" directory, and are numbered 1, 2, etc. In some cases new files that the user needs to create are included in the answers directory. If you get stuck and you want to move to the next step, copy the answers file into the main directory in order to proceed. E.g., "main_1.cpp" contains the original "main.cpp" file. "main_2.cpp" contains the "main.cpp" file after problem 1. Etc. Have fun! ======================================================================== * source/samples/uciter8/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2003-2005, International Business Machines Corporation and others. All Rights Reserved. uciter8: Lenient reading of 8-bit Unicode with a UCharIterator This sample demonstrates reading 8-bit Unicode text leniently, accepting a mix of UTF-8 and CESU-8 and also accepting single surrogates. UTF-8-style macros are defined as well as a UCharIterator. The macros are incomplete (do not assemble code points from pairs of surrogates) but sufficient for the iterator. If you wish to use the lenient-UTF/CESU-8 UCharIterator in a context outside of this sample, then copy the uit_len8.c file, as well as either the uit_len8.h header or just the prototype that it contains. *** Warning: *** This UCharIterator reads an arbitrary mix of UTF-8 and CESU-8 text. It does not conform to any one Unicode charset specification, and its use may lead to security risks. Files: uciter8.c Main source file in C uit_len8.c Lenient-UTF/CESU-8 UCharIterator implementation uit_len8.h Header file with the prototype for the lenient-UTF/CESU-8 UCharIterator uciter8.sln Windows MSVC workspace. Double-click this to get started. uciter8.vcproj Windows MSVC project file To Build uciter8 on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\uciter8\uciter8.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to wherever ICU is on your system.) 3. cd into the uciter8 directory, e.g. cd c:\icu\source\samples\uciter8\debug 4. Run it uciter8 To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3.
Compile cd /source/samples/uciter8 gmake ICU_PREFIX=/source/samples/uciter8 gmake ICU_PREFIX= check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH uciter8 Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * source/samples/ucnv/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (C) 2002-2010, International Business Machines Corporation and others. All Rights Reserved. convsamp: a sample program which demonstrates using ICU conversion This sample demonstrates Opening and closing converters using the C API String manipulation in C Writing a custom conversion callback function Files: convsamp.c Main source file flagcb.h codepage output convenience header flagcb.c codepage output convenience implementation ucnv.sln Windows MSVC workspace. Double-click this to get started. ucnv.vcproj Windows MSVC project file To Build ucnv on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\ucnv\ucnv.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to wherever ICU is on your system.) 3. cd into the ucnv directory, e.g. cd c:\icu\source\samples\ucnv\debug 4. Run it ucnv WARNING: The .bin and .txt files must be in the same directory as the executable, which is not the case by default on some systems. To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option.
The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Build set the variable ICU_PREFIX= gmake all To Run on Unixes cd /source/samples/ucnv gmake check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH convsamp Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * source/samples/udata/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2002-2010, International Business Machines Corporation and others. All Rights Reserved. udata: Low level ICU data This sample demonstrates Using the low level ICU data handling interfaces (udata) to create and later access user data. Files: writer.c C source for Writer application, will generate data file to be read by Reader. reader.c C source for Reader application, will read file created by Writer udata.sln Windows MSVC workspace. Double-click this to get started. udata.vcproj Windows MSVC project file To Build udata on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\udata\udata.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the udata directory, e.g. cd c:\icu\source\samples\udata\debug 4. Run it writer reader IMPORTANT: On some systems, the reader and writer executables may not be in the same directory. 
If this is the case, reader will likely look for the .dat file in the wrong directory. To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Compile You will need to set ICU_PATH to the location of your ICU source tree, for example ICU_PATH=/home/srl/icu (containing source, etc.) cd /source/samples/udata gmake ICU_PATH= ICU_PREFIX=/source/samples/udata gmake ICU_PREFIX= check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH writer reader Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * source/samples/ufortune/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2002-2005, International Business Machines Corporation and others. All Rights Reserved. ufortune: a sample program demonstrating the use of ICU resource files by an application. This sample demonstrates Defining resources for use by an application Compiling and packaging them into a dll Referencing the resource-containing dll from application code Loading resource data using ICU's API Files: ./ufortune.c source code for the sample ./ufortune.sln Windows MSVC workspace. Double-click this to get started. ./ufortune.vcproj Windows MSVC project file. ./Makefile Makefile for Unixes. Needs gmake. resources/root.txt Default resources (text for messages in English) resources/es.txt Spanish language resources source file.
resources/res-file-list.txt List of resource source files to be built resources/Makefile Makefile for compiling resources, for Unixes. To Build ufortune on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\ufortune\ufortune.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the ufortune directory, e.g. cd c:\icu\source\samples\ufortune\debug 4. Run it ufortune To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Build the sample cd /source/samples/ufortune export ICU_PREFIX= gmake To Run on Unixes cd /source/samples/ufortune gmake check or export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH ufortune Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * source/samples/ugrep/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2002-2005, International Business Machines Corporation and others. All Rights Reserved. ugrep: a sample program demonstrating the use of ICU regular expression API. usage: ugrep [options] pattern [file ...] --help Output a brief help message -n, --line-number Prefix each line of output with the line number within its input file. 
-V, --version Output the program version number The program searches for the specified regular expression in each of the specified files, and outputs each matching line. Input files are in the system default (locale dependent) encoding, unless they begin with a BOM, in which case they are assumed to be in the UTF encoding specified by the BOM. Program output is always in the system's default 8 bit code page. Files: ./ugrep.c source code for the sample ./ugrep.sln Windows MSVC workspace. Double-click this to get started. ./ugrep.vcproj Windows MSVC project file. ./Makefile Makefile for Unixes. Needs gmake. To Build ugrep on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\ugrep\ugrep.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the ugrep directory, e.g. cd c:\icu\source\samples\ugrep\debug 4. Run it ugrep ... To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Build the sample Put the install directory containing icu-config on the $PATH. This will generally be /bin cd /source/samples/ugrep gmake To Run on Unixes cd /source/samples/ugrep export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH ugrep ... Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. 
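The line-by-line search that ugrep performs can be sketched in plain C as follows. This is only an illustration under stated assumptions: it uses the POSIX regex functions (regcomp, regexec) as a stand-in for ICU's regular expression API, it is not taken from ugrep.c, the helper name grep_lines is hypothetical, and it ignores the encoding and BOM handling described above.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of ugrep.c. Writes every line of `text`
 * that matches the POSIX extended regex `pattern` to `out`, optionally
 * prefixed with its 1-based line number (like ugrep -n). Returns the
 * number of matching lines, or -1 if the pattern does not compile. */
int grep_lines(const char *pattern, const char *text,
               int number_lines, FILE *out)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return -1;
    }

    int matches = 0;
    int line_no = 0;
    const char *line = text;

    while (*line != '\0') {
        /* Find the end of the current line (newline or end of text). */
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);

        /* Copy the line into a NUL-terminated buffer for regexec().
         * Lines longer than the buffer are truncated in this sketch. */
        char buf[1024];
        if (len >= sizeof buf) {
            len = sizeof buf - 1;
        }
        memcpy(buf, line, len);
        buf[len] = '\0';

        line_no++;
        if (regexec(&re, buf, 0, NULL, 0) == 0) {
            if (number_lines) {
                fprintf(out, "%d:", line_no);
            }
            fprintf(out, "%s\n", buf);
            matches++;
        }

        if (end == NULL) {
            break;              /* last line had no trailing newline */
        }
        line = end + 1;
    }

    regfree(&re);
    return matches;
}
```

Calling grep_lines("b+", "abc\nxyz\nbbb\n", 1, stdout) would write the first and third lines with their line numbers and return 2. The real sample reads from files and converts between encodings; this sketch works on an in-memory string only.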
======================================================================== * source/samples/uresb/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2001-2010 International Business Machines Corporation and others. All Rights Reserved. uresb: Resource Bundle This sample demonstrates Building a resource bundle Using ICU to print data from a resource bundle Files: uresb.c Main source file in C uresb.sln Windows MSVC workspace. Double-click this to get started. uresb.vcproj Windows MSVC project file resources.dsp Windows project file for resources resources.mak Windows makefile for resources root.txt Root resource bundle en.txt English translation sr.txt Serbian translation (cp1251) To Build uresb on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\uresb\uresb.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the uresb directory, e.g. cd c:\icu\source\samples\uresb\debug 4. Run it (with a locale name, ex. english) uresb en WARNING: The .txt files must be in the same directory as the executable, which is not the case by default on some systems. To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Compile cd /source/samples/uresb gmake ICU_PREFIX= To Run on Unixes cd /source/samples/uresb gmake ICU_PREFIX= check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH uresb Note: The name of the LD_LIBRARY_PATH variable is different on some systems. 
If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * source/samples/ustring/readme.txt ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html#License Copyright (c) 2002-2005, International Business Machines Corporation and others. All Rights Reserved. ustring: Unicode String Manipulation This sample demonstrates Using ICU to manipulate UnicodeString objects Files: ustring.cpp Main source file in C++ ustring.sln Windows MSVC workspace. Double-click this to get started. ustring.vcproj Windows MSVC project file To Build ustring on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\ustring\ustring.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the ustring directory, e.g. cd c:\icu\source\samples\ustring\debug 4. Run it ustring To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Compile cd /source/samples/ustring gmake ICU_PREFIX=/source/samples/ustring gmake ICU_PREFIX= check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH ustring Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. 
======================================================================== * source/tools/gencolusb/README.md ======================================================================== Unsafe-Backward Collator Data === This directory contains tools to build the `source/i18n/collunsafe.h` precomputed data. See [Makefile](./Makefile) for more details. * Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html * Copyright (c) 2015, International Business Machines Corporation and others. All Rights Reserved. ======================================================================== * source/tools/genren/README ======================================================================== Copyright (C) 2016 and later: Unicode, Inc. and others. License & terms of use: http://www.unicode.org/copyright.html Copyright (c) 2002-2011, International Business Machines Corporation and others. All Rights Reserved. The genren.pl script is used to generate source/common/unicode/urename.h header file, which is needed for renaming the ICU exported names. This script is intended to be used on Linux, although it should work on any platform that has Perl and nm command. Makefile may need to be updated, it's not 100% portable. It also does not currently work well in an out-of-source situation. The following instructions are for Linux version. - urename.h file should be generated after implementation is complete for a release. - the version number for a release should be set according to the list in source/common/unicode/uvernum.h - Note: If you are running the script in a clean checkout, you must run the runConfigureICU at least once before running the make install-header command below. Before generating urename.h, the layout engine header files must be installed from the harfbuzz project. This is prerequisite for the icu layoutex (Paragraph Layout) project, which is subject to renaming. 
(Using the svn command is the simplest way of getting just the files from one subdirectory of the git project.) cd icu4c/source svn export https://github.com/behdad/icu-le-hb/trunk/src layout - Regenerate urename.h cd icu4c/source/tools/genren make install-header - urename.h will be updated in icu/source/common/unicode/urename.h **in your original source directory** - Warnings concerning bad namespace (not 'icu') on UCaseMap can be ignored. - The defines for "__bss_start", "_edata", and "_end" should be ignored/removed (See ICU-20176). - Eyeball the new file for errors cd icu4c/source git diff common/unicode/urename.h - Other make targets here clean - cleans out intermediate files urename.h -just builds ./urename.h ======================================================================== * source/tools/tzcode/readme.txt ======================================================================== * Copyright (C) 2016 and later: Unicode, Inc. and others. * License & terms of use: http://www.unicode.org/copyright.html ********************************************************************** * Copyright (c) 2003-2014, International Business Machines * Corporation and others. All Rights Reserved. ********************************************************************** * Author: Alan Liu * Created: August 18 2003 * Since: ICU 2.8 ********************************************************************** Note: this directory currently contains tzcode as of tzcode2014b.tar.gz with localtime.c patches from tzcode2014b.tar.gz ---------------------------------------------------------------------- OVERVIEW This file describes the tools in icu/source/tools/tzcode The purpose of these tools is to process the zoneinfo or "Olson" time zone database into a form usable by ICU4C (release 2.8 and later). Unlike earlier releases, ICU4C 2.8 supports historical time zone behavior, as well as the full set of Olson compatibility IDs. 
References: ICU4C: http://www.icu-project.org/ Olson: ftp://ftp.iana.org/tz/releases/ ---------------------------------------------------------------------- ICU4C vs. ICU4J For ICU releases >= 2.8, both ICU4C and ICU4J implement full historical time zones, based on Olson data. The implementations in C and Java are somewhat different. The C implementation is a self-contained implementation, whereas ICU4J uses the underlying JDK 1.3 or 1.4 time zone implementation. Older versions of ICU (C and Java <= 2.6) implement a "present day snapshot". This only reflects current time zone behavior, without historical variation. Furthermore, it lacks the full set of Olson compatibility IDs. ---------------------------------------------------------------------- BACKGROUND The zoneinfo or "Olson" time zone package is used by various systems to describe the behavior of time zones. The package consists of several parts. E.g.: Index of ftp://ftp.iana.org/tz/releases/ tzcode2014b.tar.gz 172 KB 3/25/2014 05:11:00 AM tzdata2014b.tar.gz 216 KB 3/25/2014 05:11:00 AM ICU only uses the tzdataYYYYV.tar.gz files, where YYYY is the year and V is the version letter ('a'...'z'). This directory has partial contents of tzcode checked into ICU ---------------------------------------------------------------------- HOWTO 0. Note, these instructions will only work on POSIX type systems. 1. Obtain the current versions of tzdataYYYYV.tar.gz (aka `tzdata') from the FTP site given above. Either manually download or use wget: $ cd {path_to}/icu/source/tools/tzcode $ wget "ftp://ftp.iana.org/tz/releases/tzdata*.tar.gz" 2. Copy only one tzdata*.tar.gz file into the icu/source/tools/tzcode/ directory (this directory). *** Make sure you only have ONE FILE named tzdata*.tar.gz in the directory. 3. Build ICU normally. You will see a notice "updating zoneinfo.txt..." ### Following instructions for ICU maintainers only ### 4. Obtain the current version of tzcodeYYYY.tar.gz from the FTP site to this directory. 5. 
Run make target "check-dump". This target extracts and builds the original tzcode and compiles the original tzdata together with the ICU supplemental data (icuzones). It then builds zdump and icuzdump and dumps all time transitions for all ICU time zones to files under the zdumpout and icuzdumpout directories. If the two produce different results, the target returns an error. 6. Don't forget to check in the new zoneinfo64.txt (from its location at {path_to}/icu/source/data/misc/zoneinfo64.txt) into SVN. ======================================================================== * LICENSE ======================================================================== COPYRIGHT AND PERMISSION NOTICE (ICU 58 and later) Copyright © 1991-2020 Unicode, Inc. All rights reserved. Distributed under the Terms of Use in https://www.unicode.org/copyright.html. Permission is hereby granted, free of charge, to any person obtaining a copy of the Unicode data files and any associated documentation (the "Data Files") or Unicode software and any associated documentation (the "Software") to deal in the Data Files or Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, and/or sell copies of the Data Files or Software, and to permit persons to whom the Data Files or Software are furnished to do so, provided that either (a) this copyright and permission notice appear with all copies of the Data Files or Software, or (b) this copyright and permission notice appear in associated Documentation. THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THE DATA FILES OR SOFTWARE. Except as contained in this notice, the name of a copyright holder shall not be used in advertising or otherwise to promote the sale, use or other dealings in these Data Files or Software without prior written authorization of the copyright holder. --------------------- Third-Party Software Licenses This section contains third-party software notices and/or additional terms for licensed third-party software components included within ICU libraries. 1. ICU License - ICU 1.8.1 to ICU 57.1 COPYRIGHT AND PERMISSION NOTICE Copyright (c) 1995-2016 International Business Machines Corporation and others All rights reserved. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, provided that the above copyright notice(s) and this permission notice appear in all copies of the Software and that both the above copyright notice(s) and this permission notice appear in supporting documentation. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. Except as contained in this notice, the name of a copyright holder shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Software without prior written authorization of the copyright holder. All trademarks and registered trademarks mentioned herein are the property of their respective owners. 2. Chinese/Japanese Word Break Dictionary Data (cjdict.txt) # The Google Chrome software developed by Google is licensed under # the BSD license. Other software included in this distribution is # provided under other licenses, as set forth below. # # The BSD License # http://opensource.org/licenses/bsd-license.php # Copyright (C) 2006-2008, Google Inc. # # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # # Redistributions of source code must retain the above copyright notice, # this list of conditions and the following disclaimer. # Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials provided with # the distribution. # Neither the name of Google Inc. nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. 
# # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, # INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR # BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF # LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS # SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # # # The word list in cjdict.txt are generated by combining three word lists # listed below with further processing for compound word breaking. The # frequency is generated with an iterative training against Google web # corpora. # # * Libtabe (Chinese) # - https://sourceforge.net/project/?group_id=1519 # - Its license terms and conditions are shown below. # # * IPADIC (Japanese) # - http://chasen.aist-nara.ac.jp/chasen/distribution.html # - Its license terms and conditions are shown below. # # ---------COPYING.libtabe ---- BEGIN-------------------- # # /* # * Copyright (c) 1999 TaBE Project. # * Copyright (c) 1999 Pai-Hsiang Hsiao. # * All rights reserved. # * # * Redistribution and use in source and binary forms, with or without # * modification, are permitted provided that the following conditions # * are met: # * # * . Redistributions of source code must retain the above copyright # * notice, this list of conditions and the following disclaimer. # * . Redistributions in binary form must reproduce the above copyright # * notice, this list of conditions and the following disclaimer in # * the documentation and/or other materials provided with the # * distribution. # * . 
Neither the name of the TaBE Project nor the names of its # * contributors may be used to endorse or promote products derived # * from this software without specific prior written permission. # * # * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS # * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE # * REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, # * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES # * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR # * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, # * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) # * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED # * OF THE POSSIBILITY OF SUCH DAMAGE. # */ # # /* # * Copyright (c) 1999 Computer Systems and Communication Lab, # * Institute of Information Science, Academia # * Sinica. All rights reserved. # * # * Redistribution and use in source and binary forms, with or without # * modification, are permitted provided that the following conditions # * are met: # * # * . Redistributions of source code must retain the above copyright # * notice, this list of conditions and the following disclaimer. # * . Redistributions in binary form must reproduce the above copyright # * notice, this list of conditions and the following disclaimer in # * the documentation and/or other materials provided with the # * distribution. # * . Neither the name of the Computer Systems and Communication Lab # * nor the names of its contributors may be used to endorse or # * promote products derived from this software without specific # * prior written permission. 
# * # * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS # * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE # * REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, # * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES # * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR # * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, # * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) # * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED # * OF THE POSSIBILITY OF SUCH DAMAGE. # */ # # Copyright 1996 Chih-Hao Tsai @ Beckman Institute, # University of Illinois # c-tsai4@uiuc.edu http://casper.beckman.uiuc.edu/~c-tsai4 # # ---------------COPYING.libtabe-----END-------------------------------- # # # ---------------COPYING.ipadic-----BEGIN------------------------------- # # Copyright 2000, 2001, 2002, 2003 Nara Institute of Science # and Technology. All Rights Reserved. # # Use, reproduction, and distribution of this software is permitted. # Any copy of this software, whether in its original form or modified, # must include both the above copyright notice and the following # paragraphs. # # Nara Institute of Science and Technology (NAIST), # the copyright holders, disclaims all warranties with regard to this # software, including all implied warranties of merchantability and # fitness, in no event shall NAIST be liable for # any special, indirect or consequential damages or any damages # whatsoever resulting from loss of use, data or profits, whether in an # action of contract, negligence or other tortuous action, arising out # of or in connection with the use or performance of this software. 
# # A large portion of the dictionary entries # originate from ICOT Free Software. The following conditions for ICOT # Free Software applies to the current dictionary as well. # # Each User may also freely distribute the Program, whether in its # original form or modified, to any third party or parties, PROVIDED # that the provisions of Section 3 ("NO WARRANTY") will ALWAYS appear # on, or be attached to, the Program, which is distributed substantially # in the same form as set out herein and that such intended # distribution, if actually made, will neither violate or otherwise # contravene any of the laws and regulations of the countries having # jurisdiction over the User or the intended distribution itself. # # NO WARRANTY # # The program was produced on an experimental basis in the course of the # research and development conducted during the project and is provided # to users as so produced on an experimental basis. Accordingly, the # program is provided without any warranty whatsoever, whether express, # implied, statutory or otherwise. The term "warranty" used herein # includes, but is not limited to, any warranty of the quality, # performance, merchantability and fitness for a particular purpose of # the program and the nonexistence of any infringement or violation of # any right of any third party. # # Each user of the program will agree and understand, and be deemed to # have agreed and understood, that there is no warranty whatsoever for # the program and, accordingly, the entire risk arising from or # otherwise connected with the program is assumed by the user. 
# # Therefore, neither ICOT, the copyright holder, or any other # organization that participated in or was otherwise related to the # development of the program and their respective officials, directors, # officers and other employees shall be held liable for any and all # damages, including, without limitation, general, special, incidental # and consequential damages, arising out of or otherwise in connection # with the use or inability to use the program or any product, material # or result produced or otherwise obtained by using the program, # regardless of whether they have been advised of, or otherwise had # knowledge of, the possibility of such damages at any time during the # project or thereafter. Each user will be deemed to have agreed to the # foregoing by his or her commencement of use of the program. The term # "use" as used herein includes, but is not limited to, the use, # modification, copying and distribution of the program and the # production of secondary products from the program. # # In the case where the program, whether in its original form or # modified, was distributed or delivered to or received by a user from # any person, organization or entity other than ICOT, unless it makes or # grants independently of ICOT any specific warranty to the user in # writing, such person, organization or entity, will also be exempted # from and not be held liable to the user for any such damages as noted # above as far as the program is concerned. # # ---------------COPYING.ipadic-----END---------------------------------- 3. Lao Word Break Dictionary Data (laodict.txt) # Copyright (c) 2013 International Business Machines Corporation # and others. All Rights Reserved. 
# # Project: http://code.google.com/p/lao-dictionary/ # Dictionary: http://lao-dictionary.googlecode.com/git/Lao-Dictionary.txt # License: http://lao-dictionary.googlecode.com/git/Lao-Dictionary-LICENSE.txt # (copied below) # # This file is derived from the above dictionary, with slight # modifications. # ---------------------------------------------------------------------- # Copyright (C) 2013 Brian Eugene Wilson, Robert Martin Campbell. # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, # are permitted provided that the following conditions are met: # # # Redistributions of source code must retain the above copyright notice, this # list of conditions and the following disclaimer. Redistributions in # binary form must reproduce the above copyright notice, this list of # conditions and the following disclaimer in the documentation and/or # other materials provided with the distribution. # # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS # FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, # INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES # (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, # STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED # OF THE POSSIBILITY OF SUCH DAMAGE. # -------------------------------------------------------------------------- 4. Burmese Word Break Dictionary Data (burmesedict.txt) # Copyright (c) 2014 International Business Machines Corporation # and others. All Rights Reserved. 
# # This list is part of a project hosted at: # github.com/kanyawtech/myanmar-karen-word-lists # # -------------------------------------------------------------------------- # Copyright (c) 2013, LeRoy Benjamin Sharon # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: Redistributions of source code must retain the above # copyright notice, this list of conditions and the following # disclaimer. Redistributions in binary form must reproduce the # above copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials provided # with the distribution. # # Neither the name Myanmar Karen Word Lists, nor the names of its # contributors may be used to endorse or promote products derived # from this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND # CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, # INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS # BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED # TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON # ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR # TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF # THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. # -------------------------------------------------------------------------- 5. Time Zone Database ICU uses the public domain data and code derived from Time Zone Database for its time zone support. 
The ownership of the TZ database is explained in BCP 175: Procedure for Maintaining the Time Zone Database section 7. # 7. Database Ownership # # The TZ database itself is not an IETF Contribution or an IETF # document. Rather it is a pre-existing and regularly updated work # that is in the public domain, and is intended to remain in the # public domain. Therefore, BCPs 78 [RFC5378] and 79 [RFC3979] do # not apply to the TZ Database or contributions that individuals make # to it. Should any claims be made and substantiated against the TZ # Database, the organization that is providing the IANA # Considerations defined in this RFC, under the memorandum of # understanding with the IETF, currently ICANN, may act in accordance # with all competent court orders. No ownership claims will be made # by ICANN or the IETF Trust on the database or the code. Any person # making a contribution to the database or code waives all rights to # future claims in that contribution or in the TZ Database. 6. Google double-conversion Copyright 2006-2011, the V8 project authors. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Google Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.