Reading and converting foreign language characters
#1  HappyJoni, 04-07-2015, 12:11 AM


I have a function that I'm passing a string to that may or may not contain characters from a foreign language. If there are any foreign language characters, I'd like to change them to their 'English' version.

Basically, I want to replace these characters:
ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ

With these characters:
SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy

I'm not sure why my code isn't working; it's really strange. I have test data in which every line has at least a few foreign characters. The first line gets read and shows only English characters in the string; the foreign characters are just gone. There should be 16 characters total in the record, 6 of which were foreign characters, but the string shows the record with only 10 characters. The rest of the records show exactly as they are in the input file. They never change and get written out exactly as they were to begin with.

I believe part of it has to do with how my StreamReader is set up. I've tried encoding for UTF32, UTF7 and UTF8. Each of those gave me different output, none of which was exactly what I was going for.

Here's my code:


Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

    Dim FileRdr As StreamReader = New StreamReader("m:\test\charReplace.txt", System.Text.Encoding.UTF7)

    Dim ReplaceWrtr As StreamWriter
    ReplaceWrtr = System.IO.File.CreateText("M:\test\CharReplaceOut.txt")

    Do While FileRdr.Peek() >= 0
        Dim currentRec As String = FileRdr.ReadLine
        removeAccent(currentRec)
        ReplaceWrtr.WriteLine(currentRec)
    Loop

    ReplaceWrtr.Close()

End Sub
'***********************
'Replace foreign language characters with English characters
Function removeAccent(ByVal myString As String)
    Dim A As String = "--"
    Dim B As String = "--"
    Const AccChars As String = "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ"
    Const RegChars As String = "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy"
    For i As Integer = 1 To Len(AccChars)
        A = Mid(AccChars, i, 1)
        B = Mid(RegChars, i, 1)
        myString = Replace(myString, A, B)
    Next
    removeAccent = myString
End Function

Any ideas? Is this even possible? Any input is greatly appreciated!

Thanks!
Joni
#2  PlausiblyDamp, 04-07-2015, 02:35 AM
If you try your function by passing in a hard coded string containing the characters you want to remove does it work? If so then you can be sure the problem is with reading the file rather than the function itself.
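Editor's note: the hard-coded check suggested above could be sketched roughly as follows. This is only an illustration, not the poster's exact code: the module name, the sample string, and the `RemoveAccent` signature are assumptions, and the accented character table is a reconstruction aligned position by position with the `RegChars` string from the first post. The source file is assumed to be saved as UTF-8 so the compiler reads the literals correctly.

```vbnet
Imports System

Module AccentCheck
    ' Assumed tables: the character at position i in AccChars is replaced
    ' by the character at position i in RegChars (60 characters each).
    Const AccChars As String = "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ"
    Const RegChars As String = "SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy"

    ' Returns a copy of the input with each listed accented character
    ' replaced by its unaccented counterpart.
    Function RemoveAccent(ByVal text As String) As String
        For i As Integer = 0 To AccChars.Length - 1
            text = text.Replace(AccChars(i), RegChars(i))
        Next
        Return text
    End Function

    Sub Main()
        ' Hypothetical sample; no file or StreamReader is involved here.
        Dim sample As String = "Crème brûlée"
        Dim result As String = RemoveAccent(sample)
        Console.WriteLine(result) ' prints: Creme brulee
    End Sub
End Module
```

Note that the sketch assigns the function's return value to `result`. In the code from the first post, `removeAccent(currentRec)` is called without using the return value, and because the parameter is `ByVal`, `currentRec` is written out unchanged regardless of how the file was decoded.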

Where is this input file coming from? Would you be able to find out what format it is saved in? It might not be Unicode but ANSI, which would mean you need to know the correct code page to read it correctly as well.
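Editor's note: a minimal sketch of reading an ANSI file with an explicit code page, under stated assumptions. It assumes the file was saved as "ANSI" on a Western European Windows system, which corresponds to code page 1252; the file name and sample text are made up for the demonstration. On .NET Framework `Encoding.GetEncoding(1252)` works directly; on .NET Core and later, the code pages provider would have to be registered first, which is outside what the thread covers.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module AnsiReadSketch
    Sub Main()
        ' Assumption: "ANSI" here means Windows-1252. On .NET Core / .NET 5+
        ' call Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)
        ' before this line, or GetEncoding(1252) will throw.
        Dim ansi As Encoding = Encoding.GetEncoding(1252)

        ' Create a small ANSI test file (hypothetical name and content).
        Dim samplePath As String = Path.Combine(Path.GetTempPath(), "charReplaceAnsi.txt")
        File.WriteAllText(samplePath, "Crème brûlée", ansi)

        ' Reading with the same code page recovers the accented characters.
        Using reader As New StreamReader(samplePath, ansi)
            Do While reader.Peek() >= 0
                Console.WriteLine(reader.ReadLine())
            Loop
        End Using

        ' For comparison: decoding the same bytes as UTF-8 does not recover
        ' them, because the single ANSI bytes are not valid UTF-8 sequences.
        Using wrongReader As New StreamReader(samplePath, Encoding.UTF8)
            Console.WriteLine(wrongReader.ReadToEnd())
        End Using
    End Sub
End Module
```

The point of the comparison is that the reader's encoding has to match how the file was written; picking among the Unicode encodings cannot work if the file is not Unicode in the first place.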

Also, as a final point, what are you going to be doing with this "English" string? Changing the letters like this will completely ruin any meaning. Letters cannot simply be replaced in this way if the input makes genuine use of these characters.
__________________
Intellectuals solve problems; geniuses prevent them.
-- Albert Einstein

#3  HappyJoni, 04-07-2015, 11:15 AM
Thank You!

Thanks for your response, PlausiblyDamp.

I'm aware that removing the accents changes the meaning of the word, but that's what the user is requesting. I tried talking them into leaving it as is but they wouldn't go for it. I just do what I'm told.

I tried your suggestion and passed a hard coded string to the function, and it worked perfectly! Looks like it's the StreamReader that's causing a problem.

Would you happen to know how I should set it up? Maybe you know of a list somewhere that I could reference that shows the different encoding types that could be used? The file I'm using for testing is just a basic text file I created in Notepad. Eventually, I'm going to have to use this code to read data from a data table (created from an Excel file) and make the text changes...
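Editor's note: regarding a list of encodings to reference, one option is to ask the runtime itself. The sketch below uses `Encoding.GetEncodings()`, which returns the encodings known to the current runtime; the module name and the output formatting are arbitrary choices. On .NET Framework the list includes the Windows code pages, while newer .NET runtimes report a much shorter list unless additional providers are registered.

```vbnet
Imports System
Imports System.Text

Module ListEncodings
    Sub Main()
        ' Each EncodingInfo carries the numeric code page, the name accepted
        ' by Encoding.GetEncoding(name), and a human-readable description.
        For Each info As EncodingInfo In Encoding.GetEncodings()
            Console.WriteLine("{0,-6} {1,-25} {2}", info.CodePage, info.Name, info.DisplayName)
        Next
    End Sub
End Module
```

Either the code page number or the name from this list can be passed to `Encoding.GetEncoding` to build the encoding handed to the StreamReader.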

You're always a great help! Thanks so much, I really appreciate everything you do here.
#4  PlausiblyDamp, 04-07-2015, 05:04 PM
If you are saving the file in Notepad, you can choose the encoding from the save dialog (there should be a drop-down list somewhere...).

If the file is being provided by a 3rd party then it can get a little more interesting unless they can provide details about the format. Other than that there isn't really a reliable way of guessing the format...
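Editor's note: while the format generally cannot be guessed reliably, files that start with a byte order mark (BOM) can be recognized. The sketch below is one possible approach, not something proposed in the thread: it uses the `StreamReader` constructor overload that takes a `detectEncodingFromByteOrderMarks` flag, so a BOM wins if present and the supplied encoding is only a fallback. The file name and sample text are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BomDetectSketch
    Sub Main()
        ' Write a test file as UTF-16 LE; File.WriteAllText emits the BOM
        ' for Encoding.Unicode.
        Dim samplePath As String = Path.Combine(Path.GetTempPath(), "charReplaceBom.txt")
        File.WriteAllText(samplePath, "Crème brûlée", Encoding.Unicode)

        ' The second argument is only the fallback used when no BOM is found.
        ' For BOM-less ANSI files the fallback would need to be the right
        ' code page instead of UTF-8.
        Using reader As New StreamReader(samplePath, Encoding.UTF8, True)
            Dim content As String = reader.ReadToEnd()

            ' CurrentEncoding reflects the detected encoding only after
            ' the first read has taken place.
            Console.WriteLine(reader.CurrentEncoding.EncodingName)
            Console.WriteLine(content = "Crème brûlée")
        End Using
    End Sub
End Module
```

This only helps for files that actually carry a BOM. A file without one gives the reader nothing to detect, which is where details from the third party about the format become necessary.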