Arietta i18n is a CLI workflow tool that uses ChatGPT for automated i18n.
- [x] 🤖 Utilize ChatGPT for automated i18n translation
- [x] ✂️ Support automatic splitting of large files without worrying about ChatGPT token limits.
- [x] ♻️ Support incremental i18n updates, automatically extracting new content based on `entry` files.
- [x] 🗂️ Support single file mode (`en_US.json`) and folder mode (`en_US/common.json`) to work seamlessly with `i18next`.
- [x] 🌲 Support `flat` and `tree` structures for locale files.
- [x] 🛠️ Support customizing OpenAI models, API proxies, and temperature.
- [x] 📝 Support automated i18n translation of `Markdown` files.
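As a rough illustration of what an incremental update involves, the sketch below compares an entry locale object with an existing translation and keeps only the keys that still need translating. The helper name `extractMissing` and the sample data are hypothetical; this is not Arietta i18n's actual implementation.

```javascript
// Hypothetical helper: returns the entries of `entry` that are missing from `existing`.
// Handles nested (i18next-style) objects; leaf values are treated as strings to translate.
function extractMissing(entry, existing = {}) {
  const missing = {};
  for (const [key, value] of Object.entries(entry)) {
    const isBranch = value !== null && typeof value === 'object' && !Array.isArray(value);
    if (isBranch) {
      const prev = existing[key];
      const child = extractMissing(value, prev && typeof prev === 'object' ? prev : {});
      // Keep the branch only if something inside it is missing.
      if (Object.keys(child).length > 0) missing[key] = child;
    } else if (!(key in existing)) {
      missing[key] = value;
    }
  }
  return missing;
}

const entry = { title: 'Hello', nav: { home: 'Home', about: 'About' } };
const existing = { title: 'Labas', nav: { home: 'Pradžia' } };
console.log(extractMissing(entry, existing)); // { nav: { about: 'About' } }
```

Only the missing subset would then need to be sent for translation, which is what keeps repeated runs cheap.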
To install Arietta i18n, run the following command:
npm install -g @arietta-studio/arietta-i18n
> [!IMPORTANT]
> Please make sure you have `Node.js` version >= 18.
To initialize the Arietta i18n configuration, run the following command:
$ arietta-i18n -o # or use the full flag --option
> [!IMPORTANT]
> To use AI auto-generation, you need to fill in your OpenAI token in the settings.
# Translate Locale files
$ arietta-i18n # or $ arietta-i18n locale
# Translate Markdown files
$ arietta-i18n md
# Run i18n translation and markdown translation simultaneously
$ arietta-i18n --with-md
# Specify the configuration file
$ arietta-i18n -c './custom-config.js' # or use the full flag --config
You can choose any configuration method in cosmiconfig format:

- `i18n` property in `package.json`
- `.i18nrc` file in JSON or YAML format
- `.i18nrc.json`, `.i18nrc.yaml`, `.i18nrc.yml`, `.i18nrc.js`, `.i18nrc.cjs`
- `defineConfig` provides a secure definition method that can be imported from `@arietta-studio/arietta-i18n`
> [!TIP]
> This project provides a secure definition method `defineConfig` that can be imported from `@arietta-studio/arietta-i18n`.
This project provides some additional configuration items set with environment variables:

| Environment Variable | Required | Description | Example |
| --- | --- | --- | --- |
| `OPENAI_API_KEY` | Yes | The API key you obtained from the OpenAI account page | `sk-xxxxxx...xxxxxx` |
| `OPENAI_PROXY_URL` | No | If you manually configure an OpenAI API proxy, use this item to override the default OpenAI API request base URL | `https://api.chatanywhere.cn/v1` (the default value is `https://api.openai.com/v1`) |
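The way these two variables interact can be sketched as follows. This is a minimal illustration under the assumption that the proxy URL simply replaces the default base URL when set; `resolveOpenAIOptions` is a hypothetical helper, not a function exported by Arietta i18n.

```javascript
// Default base URL as documented for OPENAI_PROXY_URL.
const DEFAULT_OPENAI_BASE_URL = 'https://api.openai.com/v1';

// Hypothetical helper: derives request options from environment variables.
function resolveOpenAIOptions(env = process.env) {
  const apiKey = env.OPENAI_API_KEY;
  if (!apiKey) {
    // OPENAI_API_KEY is the only required variable.
    throw new Error('OPENAI_API_KEY is required');
  }
  return {
    apiKey,
    // Fall back to the default when no proxy is configured.
    baseURL: env.OPENAI_PROXY_URL || DEFAULT_OPENAI_BASE_URL,
  };
}

console.log(resolveOpenAIOptions({ OPENAI_API_KEY: 'sk-xxxxxx' }).baseURL);
// https://api.openai.com/v1
```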
| Property Name | Required | Type | Default Value | Description |
| --- | --- | --- | --- | --- |
| entry | `*` | `string` | - | Entry file or folder |
| entryLocale | `*` | `string` | - | Language to use as translation reference |
| modelName | | `string` | `gpt-3.5-turbo` | Model to use |
| output | `*` | `string` | - | Location to store localized files |
| outputLocales | `*` | `string[]` | `[]` | All the languages to be translated |
| reference | | `string` | - | Provide some context for more accurate translations |
| splitToken | | `number` | - | Split the localized JSON file by tokens, automatically calculated by default |
| temperature | | `number` | `0` | Sampling temperature to use |
| concurrency | | `number` | `5` | Number of concurrently pending promises returned |
| experimental | | `experimental` | `{}` | Experimental features, see below |
| markdown | | `markdown` | `{}` | See markdown configuration below |
| Property Name | Required | Type | Default Value | Description |
| --- | --- | --- | --- | --- |
| jsonMode | | `boolean` | `false` | Force the GPT model to return JSON output for stability (only supported by models released after November 2023) |
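To make the required fields and default values concrete, here is a small sketch that merges a user configuration with the documented defaults. `applyDefaults`, `DEFAULTS`, and `REQUIRED` are hypothetical names used for illustration only; the tool's real merging logic may differ.

```javascript
// Defaults copied from the documented option tables.
const DEFAULTS = {
  modelName: 'gpt-3.5-turbo',
  temperature: 0,
  concurrency: 5,
  experimental: { jsonMode: false },
  markdown: {},
};

// Options marked with `*` in the table.
const REQUIRED = ['entry', 'entryLocale', 'output', 'outputLocales'];

// Hypothetical helper: validates required options and fills in defaults.
function applyDefaults(userConfig) {
  const missing = REQUIRED.filter((key) => userConfig[key] === undefined);
  if (missing.length > 0) {
    throw new Error(`Missing required options: ${missing.join(', ')}`);
  }
  return {
    ...DEFAULTS,
    ...userConfig,
    // Merge nested experimental flags instead of replacing the whole object.
    experimental: { ...DEFAULTS.experimental, ...userConfig.experimental },
  };
}

const config = applyDefaults({
  entry: 'locales/en_US.json',
  entryLocale: 'en_US',
  output: 'locales',
  outputLocales: ['lt_LT'],
  temperature: 0.2,
});
console.log(config.modelName, config.temperature); // gpt-3.5-turbo 0.2
```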
const { defineConfig } = require('@arietta-studio/arietta-i18n');

module.exports = defineConfig({
  entry: 'locales/en_US.json',
  entryLocale: 'en_US',
  output: 'locales',
  outputLocales: ['lt_LT'],
});

{
  "entry": "locales/en_US.json",
  "entryLocale": "en_US",
  "output": "locales",
  "outputLocales": ["lt_LT"]
}

{
  "...": "...",
  "i18n": {
    "entry": "locales/en_US.json",
    "entryLocale": "en_US",
    "output": "locales",
    "outputLocales": ["lt_LT"]
  }
}
There are two types of file structures supported: `flat` and `tree`.
A flat structure means that all translations for a given language are stored in a single file, one file per language, as shown below:
- locales
  - en_US.json
  - lt_LT.json
  - ...
> [!TIP]
> The `flat` structure requires setting the `entry` property in the configuration file to the corresponding JSON file.

Example:
{
  "entry": "locales/en_US.json",
  "entryLocale": "en_US",
  "output": "locales",
  "outputLocales": ["lt_LT"]
}
A tree structure means that the translations for each language are stored in separate language folders, as shown below:
- locales
  - en_US
    - common.json
    - header.json
    - subfolder
      - ...
  - lt_LT
    - common.json
    - header.json
    - subfolder
      - ...
> [!TIP]
> The `tree` structure requires setting the `entry` property in the configuration file to the corresponding folder.

Example:
{
  "entry": "locales/en_US",
  "entryLocale": "en_US",
  "output": "locales",
  "outputLocales": ["lt_LT"]
}
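The difference between the two layouts can be sketched as a mapping from an entry file to its translated counterpart. This is an illustration only: `resolveOutputPath` is a hypothetical helper, and it assumes forward-slash paths and that a `.json` entry means the flat structure.

```javascript
// Hypothetical helper: maps one entry file to the output file for a target locale.
// Assumes forward-slash paths and that `file` lives under `entry` in tree mode.
function resolveOutputPath({ entry, output }, file, locale) {
  if (entry.endsWith('.json')) {
    // Flat structure: one JSON file per locale, directly under `output`.
    return `${output}/${locale}.json`;
  }
  // Tree structure: keep the file's position relative to the entry folder.
  const relative = file.slice(entry.length).replace(/^\/+/, '');
  return `${output}/${locale}/${relative}`;
}

const flat = { entry: 'locales/en_US.json', output: 'locales' };
console.log(resolveOutputPath(flat, 'locales/en_US.json', 'lt_LT'));
// locales/lt_LT.json

const tree = { entry: 'locales/en_US', output: 'locales' };
console.log(resolveOutputPath(tree, 'locales/en_US/common.json', 'lt_LT'));
// locales/lt_LT/common.json
```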
Use the `arietta-i18n` command to generate i18n files automatically:
$ arietta-i18n
| Property Name | Required | Type | Default | Description |
| --- | --- | --- | --- | --- |
| entry | `*` | `string[]` | `[]` | Entry file or folder, supports `glob` |
| entryLocale | | `string` | Inherit parent locale | Reference language for translation |
| entryExtension | | `string` | `.md` | Entry file extension |
| exclude | | `string[]` | `[]` | Files to be filtered out, supports `glob` |
| outputLocales | | `string[]` | Inherit parent locale | All languages to be translated |
| outputExtensions | | `function` | `(locale) => '.{locale}.md'` | Output file extension generation |
| mode | | `string`, `mdast`, `function` | `string` | Translation mode selection, explained below |
| translateCode | | `boolean` | `false` | Whether to translate code blocks in `mdast` mode; has no effect in other modes |
By default, the translated file names are generated as `.{locale}.md`. You can customize the output file extensions with `outputExtensions`.
> [!NOTE]
> In the example below, the entry file extension is `.md`, but we want the output file extension for the `lt-LT` translation to be `.md`, while other languages keep the default extensions.
module.exports = {
  markdown: {
    entry: ['./README.md', './docs/**/*.md'],
    entryLocale: 'en-US',
    entryExtension: '.md',
    outputLocales: ['lt-LT'],
    outputExtensions: (locale, { getDefaultExtension }) => {
      if (locale === 'en-US') return '.md';
      return getDefaultExtension(locale);
    },
  },
};
`outputExtensions` supports the following `props`:
interface OutputExtensionsProps {
  /**
   * @description The locale of the translated file to output
   */
  locale: string;
  config: {
    /**
     * @description The content of the input file being translated
     */
    fileContent: string;
    /**
     * @description The path of the input file being translated
     */
    filePath: string;
    /**
     * @description The default method for generating extensions
     */
    getDefaultExtension: (locale: string) => string;
  };
}
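The following sketch shows one plausible way these inputs combine into an output file name: the entry extension is stripped and replaced by whatever `outputExtensions` returns. `resolveMarkdownOutput` is a hypothetical helper written for this illustration, and the default extension follows the documented `.{locale}.md` pattern.

```javascript
// Default extension pattern as documented: `.{locale}.md`.
const getDefaultExtension = (locale) => `.${locale}.md`;

// Hypothetical helper: computes the output path of one translated Markdown file.
function resolveMarkdownOutput(filePath, fileContent, locale, options = {}) {
  const { entryExtension = '.md', outputExtensions } = options;
  // Strip the entry extension so the new one can be appended.
  const base = filePath.endsWith(entryExtension)
    ? filePath.slice(0, filePath.length - entryExtension.length)
    : filePath;
  // Call the user callback with the documented arguments, or use the default.
  const extension = outputExtensions
    ? outputExtensions(locale, { fileContent, filePath, getDefaultExtension })
    : getDefaultExtension(locale);
  return base + extension;
}

console.log(resolveMarkdownOutput('docs/usage.md', '# Usage', 'lt-LT'));
// docs/usage.lt-LT.md
```

A custom `outputExtensions` callback passed in `options` receives the same `locale` and `config` values described in the interface above.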
`mode` specifies the translation mode. Two built-in modes are supported, as well as custom generation functions.

- `string` - Translates the complete `markdown` content.
- `mdast` - Parses the text into an `mdast` structure and translates the `text value` content. To translate code blocks, you need to enable `translateCode`.
> [!WARNING]
> In `mdast` mode, the content to be translated will be reduced to a minimum, removing most markdown syntax structures and links. This mode can greatly reduce token consumption, but it may result in inaccurate translation results.
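To show why `mdast` mode sends so little content, the sketch below walks a hand-written mdast tree and collects only the `value` of `text` nodes, plus `code` nodes when `translateCode` is enabled. `collectText` is a hypothetical helper and the tree is built by hand rather than parsed, so this is an approximation of the idea, not the tool's actual pipeline.

```javascript
// Hypothetical helper: collects the translatable strings of an mdast tree.
function collectText(node, { translateCode = false } = {}, out = []) {
  if (node.type === 'text') out.push(node.value);
  // Code blocks are only included when translateCode is enabled.
  if (node.type === 'code' && translateCode) out.push(node.value);
  for (const child of node.children || []) {
    collectText(child, { translateCode }, out);
  }
  return out;
}

// Hand-written mdast tree: a heading, a paragraph with a link, and a code block.
const tree = {
  type: 'root',
  children: [
    { type: 'heading', depth: 1, children: [{ type: 'text', value: 'Usage' }] },
    {
      type: 'paragraph',
      children: [
        { type: 'text', value: 'See the ' },
        { type: 'link', url: 'https://example.com', children: [{ type: 'text', value: 'docs' }] },
      ],
    },
    { type: 'code', lang: 'sh', value: 'arietta-i18n md' },
  ],
};

console.log(collectText(tree)); // [ 'Usage', 'See the ', 'docs' ]
console.log(collectText(tree, { translateCode: true }).length); // 4
```

Note that the link URL and the heading markup never reach the collected strings, and the sentence is split into fragments, which is consistent with the reduced context the warning describes.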
The translated files will be generated in the same directory as the entry file, with the corresponding language suffix added to the extension:
- README.md
- README.lt-LT.md
- docs
  - usage.md
  - usage.lt-LT.md
  - subfolder
    - ...
> [!TIP]
> Use the `arietta-i18n md` command to automate the generation of i18n files:
$ arietta-i18n md
You can use GitHub Codespaces for online development.
Alternatively, you can clone the repository and run the following commands for local development:
$ git clone https://github.com/arietta-studio/arietta-tools.git
$ cd arietta-tools
$ bun install
$ cd packages/arietta-i18n
$ bun dev
We welcome contributions in all forms. If you're interested in contributing code, you can check our GitHub Issues, show off your skills, and demonstrate your ideas.
- langchainjs - https://github.com/hwchase17/langchainjs
- ink - https://github.com/vadimdemedes/ink
- transmart - https://github.com/Quilljou/transmart
Copyright © 2024 Arietta Studio.
This project is licensed under the MIT license.