Corpus Aligner
This page describes how to use the GlobalSight Corpus Aligner. The Corpus Aligner aligns the source and target versions of the same content to produce translated segment pairs suitable for inclusion in a translation memory.
This example aligns the attached en-US.txt (English) and fr-FR.txt (French) files. To follow along, right-click the links and save the files somewhere on your computer.
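Before uploading, it can help to confirm that both files decode cleanly and contain a similar number of lines, since a large mismatch usually means more manual alignment later. The following is a minimal sketch of such a check, not part of GlobalSight or GAlign; the function names `read_segments` and `precheck` are hypothetical, and UTF-8 is only an assumed encoding.

```python
from pathlib import Path


def read_segments(path, encoding="utf-8"):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    # read_text may raise UnicodeDecodeError if the bytes are not valid
    # in the given encoding (assumed here to be UTF-8).
    text = Path(path).read_text(encoding=encoding)
    return [line.strip() for line in text.splitlines() if line.strip()]


def precheck(source_path, target_path, encoding="utf-8"):
    """Count non-empty lines in both files and report the difference."""
    source = read_segments(source_path, encoding)
    target = read_segments(target_path, encoding)
    return {
        "source": len(source),
        "target": len(target),
        "difference": abs(len(source) - len(target)),
    }


if __name__ == "__main__":
    # Assumes the two example files were saved in the current directory.
    print(precheck("en-US.txt", "fr-FR.txt"))
```

A difference of zero does not guarantee that the lines correspond one to one; it only suggests that the two files have a similar structure.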
Setup
- Download the Corpus Aligner from the GlobalSightSaaS Support page.
- Make sure that you have the following two language pairs in GlobalSight:
  - English (United States) [en_US] --> French (France) [fr_FR]
  - French (France) [fr_FR] --> English (United States) [en_US]
Note
You need the French (France) [fr_FR] --> English (United States) [en_US] pair because the fr-FR.txt file must be uploaded under a French source locale. A locale is only available as an upload source if a language pair exists with that locale as its source.
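The rule in this note can be illustrated with a small sketch. It does not call GlobalSight; it only models language pairs as (source, target) tuples, and the function name `source_locales` is hypothetical.

```python
def source_locales(language_pairs):
    """Return the locales that appear as the source of at least one pair."""
    return {source for source, _target in language_pairs}


# With only the English-to-French pair, fr_FR is not a source locale.
pairs = [("en_US", "fr_FR")]
print("fr_FR" in source_locales(pairs))  # False

# Adding the reverse pair makes fr_FR available as a source locale.
pairs.append(("fr_FR", "en_US"))
print("fr_FR" in source_locales(pairs))  # True
```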
Usage
- From the menu, select Data Source > Upload.
- Create a job with any name and select English (United States) as the source. Choose any project.
- When the Upload screen comes up, select the en-US.txt file and upload it. At the Upload Summary screen, click Done. You do not need to import the file; uploading it is enough.
- From the menu, select Data Source > Upload again.
- Create a job with any name and select French (France) as the source. Choose any project. If you have not created a language pair with French (France) as the source, you will not be able to upload the fr-FR.txt file.
- When the Upload screen comes up, select the fr-FR.txt file and upload it. At the Upload Summary screen, click Done. As before, you do not need to import the file.
- From the menu, select Data Source > Corpus Aligner > Create Aligner Package.
- Enter a package name, select the appropriate file type (txt in this case), source locale, target locale, and encodings, then click Next.
- In the "Select Files to Align" screen, drill down into the source and target folders until you can select en-US.txt and fr-FR.txt. Select the file pair, click Next, and then click Create Package.
- When the package is ready, it will appear in the "Download Aligner Package" screen. Select the package and click Download.
- Save the package to your desktop and start GAlign 1.0, the aligner tool. Select File > Open Project and open the downloaded package.
- You should now see the segments. Right-click a segment icon and choose "Disconnect Alignment" to remove an alignment, or drag from one segment to another to align them. When you have finished the manual alignment, save it with Align > Save Page.
- Select File > Prepare for Upload to prepare the package for upload.
- In GlobalSight, select Data Source > Corpus Aligner and click Upload Aligner Package. Select the package and a translation memory (TM), then click Upload Package.
- Select Setup > Translation Memories to verify that the aligned segments were added to the TM.
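The manual alignment step above can be pictured as editing a list of links between source and target segments. The sketch below is only a conceptual model under the assumption of one-to-one links; it is not how GAlign stores alignments, and the names `disconnect`, `connect`, and `aligned_pairs` are hypothetical.

```python
def disconnect(links, source_index):
    """Remove any link that starts at the given source segment."""
    return [(s, t) for s, t in links if s != source_index]


def connect(links, source_index, target_index):
    """Link one source segment to one target segment.

    Assumes one-to-one alignment: existing links that use either
    segment are dropped before the new link is added.
    """
    kept = [(s, t) for s, t in links if s != source_index and t != target_index]
    return sorted(kept + [(source_index, target_index)])


def aligned_pairs(source, target, links):
    """Turn index links into (source text, target text) pairs."""
    return [(source[s], target[t]) for s, t in links]


source = ["Hello.", "How are you?", "Goodbye."]
target = ["Bonjour.", "Au revoir.", "Comment allez-vous ?"]

# Start from a naive position-by-position alignment, which is wrong here.
links = [(0, 0), (1, 1), (2, 2)]

# Fix it the way a user would: disconnect, then re-link by dragging.
links = disconnect(links, 1)
links = connect(links, 1, 2)
links = connect(links, 2, 1)

for pair in aligned_pairs(source, target, links):
    print(pair)
```

The resulting list of text pairs is the kind of data that ends up in the translation memory after the package is uploaded.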
