
DictManager merge text files - RapidSpell Web Java - Forum

gregcharles
#1 Posted : Thursday, March 6, 2014 9:43:23 PM
Rank: Member

Groups: Registered

Joined: 4/15/2013
Posts: 34
I'm having trouble merging a .txt file into an existing .dict file.

The merge appears to work, but the word count only increases by one, and the first word in the text file still doesn't exist in the dictionary afterwards.

From the Swing DictManager tool's status window:

151306 words in dict.
'brillig' NOT found in dict.
Merged words from C:\temp\TestWords.txt.
151307 words in dict.
'brillig' NOT found in dict.

Contents of TestWords.txt:
brillig
mimsey
borogove
mome
outgrabe

Am I using this tool wrong? Are there specific requirements for the .txt file that I may be violating?
Jim
#2 Posted : Thursday, March 6, 2014 11:02:09 PM
Rank: Advanced Member

Groups: Administrators, Registered

Joined: 8/13/2004
Posts: 2,667
Location: Canada
Hi, yes the text file needs to be saved with UTF-16 encoding.

Sorry I realize that isn't obvious.


Jim

-your feedback is helpful to other users, thank you!
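
For readers hitting the same problem, here is a minimal sketch of converting a UTF-8 (or plain ASCII) word list to UTF-16 before merging. It uses only the standard Java library. The class and method names are hypothetical, and it assumes DictManager accepts little-endian UTF-16 with a leading byte order mark, which is what Notepad's "Unicode" option writes; the thread does not say which UTF-16 variants the tool accepts.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of RapidSpell or DictManager.
class Utf8ToUtf16 {

    /** Re-encodes UTF-8 bytes as UTF-16LE preceded by a BOM (FF FE). */
    static byte[] toUtf16WithBom(byte[] utf8Bytes) {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        // A UTF-8 BOM decodes to U+FEFF; drop it so it is not glued to the first word.
        if (text.startsWith("\uFEFF")) {
            text = text.substring(1);
        }
        // Assumption: the target tool wants little-endian UTF-16 with a BOM.
        return ("\uFEFF" + text).getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: Utf8ToUtf16 <input.txt> <output.txt>");
            return;
        }
        Path in = Paths.get(args[0]);   // e.g. the UTF-8 word list
        Path out = Paths.get(args[1]);  // UTF-16 copy to merge
        Files.write(out, toUtf16WithBom(Files.readAllBytes(in)));
    }
}
```

Run it with the source and destination paths as arguments, then merge the output file. Pure ASCII input is valid UTF-8, so the same conversion applies to it.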



gregcharles
#3 Posted : Thursday, March 6, 2014 11:21:31 PM
Rank: Member

Groups: Registered

Joined: 4/15/2013
Posts: 34
Yes, changing the format to Unicode did work. Thanks!

There is a small inconsistency though. I create a user dictionary for words added from the spell check dialog within our application via:

rapidSpell.setParameterValue("default", "UserDictionaryFile",
"../dictionaries/" + getUserName() + ".txt");

(getUserName() is our function for identifying the current logged in user.)

The text files that RapidSpell creates for the users seem to be plain ASCII, so as it stands they cannot be merged into the main dictionary file unless they are first converted to Unicode. Is there some way to control the encoding used for the user dictionary files?
Jim
#4 Posted : Friday, March 7, 2014 12:16:33 AM
Rank: Advanced Member

Groups: Administrators, Registered

Joined: 8/13/2004
Posts: 2,667
Location: Canada
Actually the user dictionary is UTF-8, which looks like ASCII unless it contains characters outside the ASCII range. Maybe it would be more helpful to change Dict Manager to accept UTF-8 as well. I'll get you a new Dict Manager with a "merge from UTF-8" menu option.

-your feedback is helpful to other users, thank you!



Jim
#5 Posted : Sunday, March 9, 2014 4:40:03 PM
Rank: Advanced Member

Groups: Administrators, Registered

Joined: 8/13/2004
Posts: 2,667
Location: Canada
Here you go, the new menu item should make sense now:

http://keyoti.com/Upload...dFolder/DictManager.jar

-your feedback is helpful to other users, thank you!



gregcharles
#6 Posted : Wednesday, March 12, 2014 7:20:03 PM
Rank: Member

Groups: Registered

Joined: 4/15/2013
Posts: 34
Thanks Jim, I'll try that out! As far as the user dictionary files go, they aren't marked with the leading 0xEF, 0xBB, 0xBF that would be typical for a UTF-8 encoded text file. If I edit one via Notepad, and save it with a UTF-8 encoding, then those bytes get added and that messes up the spell checker. You might want to look for those characters and strip them if they exist.

Another small point: the DictManager tool doesn't prompt to save unsaved modifications on exit. That's giving our testers conniption fits, which they, of course, pass on to me. Is there any way the tool could check whether changes have been made (by manual add/remove or by merge) and show a basic "Save/Don't Save/Cancel" prompt if the user tries to exit before saving?
Jim
#7 Posted : Wednesday, March 12, 2014 10:39:02 PM
Rank: Advanced Member

Groups: Administrators, Registered

Joined: 8/13/2004
Posts: 2,667
Location: Canada
Those chars are the byte order mark (BOM). UTF-8 can come with or without it (I think Microsoft tools tend to write it more than others), but if it's helpful then we can look for it.
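
As an illustration only, a check for the byte order mark could look like the sketch below. This is not Keyoti's implementation; the class and method names are hypothetical. It tests for the three leading bytes 0xEF, 0xBB, 0xBF mentioned above and returns the data without them.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of RapidSpell or DictManager.
class BomStripper {

    /** Returns the data without a leading UTF-8 BOM, or the same array if none is present. */
    static byte[] stripUtf8Bom(byte[] data) {
        boolean hasBom = data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
        return hasBom ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    public static void main(String[] args) {
        byte[] word = "mome".getBytes(StandardCharsets.UTF_8);
        // Build a sample buffer: BOM followed by the word.
        byte[] withBom = new byte[3 + word.length];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(word, 0, withBom, 3, word.length);

        System.out.println(new String(stripUtf8Bom(withBom), StandardCharsets.UTF_8)); // prints "mome"
    }
}
```

Working on raw bytes before decoding keeps the check independent of how the rest of the file is read.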

Regarding the confirmation, sure, it's been added. I should mention, though, that Dict Manager isn't really an end-user tool; it's meant for devs to work on the dictionaries.

http://keyoti.com/Upload...DictManager.jar?refresh



-your feedback is helpful to other users, thank you!




Copyright © 2002- Keyoti Inc.