Using Directory.GetFiles() WITH multiple extensions AND sort order


  #1  
Old 09-01-2013, 05:08 AM
Jayme65
Newcomer
 
Join Date: Jan 2012
Posts: 7
Default Using Directory.GetFiles() WITH multiple extensions AND sort order


Hi,

I have to get a directory file list, filtered on multiple extensions...and sorted!

I use this, which is the fastest way I've found to get dir content filtered on multiple extensions:

Code:
Dim ext As String() = {"*.jpg", "*.bmp", "*.png"}
Dim files As String() = ext.SelectMany(Function(f) Directory.GetFiles(romPath, f)).ToArray
Array.Sort(files)
and then use an array sort.


I was wondering (and this is my question) whether there is a way to do the sorting in the same main line, something like:
Code:
Dim files As String() = ext.SelectMany(Function(f) Directory.GetFiles(romPath, f).Order By Name).ToArray
And if so, would I gain any speed compared with sorting the array at the end? (I'll run my own tests and report back as soon as I have a solution!)
Thanks for your help!!
  #2  
Old 09-01-2013, 08:47 AM
AtmaWeapon
Fabulous Florist

Forum Leader
* Guru *
 
Join Date: Feb 2004
Location: Austin, TX
Posts: 9,500
Default

Short answer: there is, but it won't help.

There's an OrderBy() method that does the same thing the "SQL syntax" does and works like SelectMany(); in this case it takes a function that returns the value used for sorting.

Unfortunately I don't think it will make this go any faster. Directory.GetFiles() returns its results all at once, which means you pay the biggest performance penalty up-front before you've even sorted. There's a Directory.EnumerateFiles() that returns them one at a time, but in order to sort OrderBy() has to iterate over the entire collection anyway.

So as long as "sorted" is a requirement, no variation of the code is going to be significantly faster than any other.
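To make the point above concrete, here is a minimal sketch of the single-expression version using Directory.EnumerateFiles() and OrderBy(). The module name, function name, and parameter names are made up for illustration. Even though the paths arrive lazily, OrderBy() has to buffer all of them before it can hand back the first one, so this is about readability rather than speed.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module SortedFileListing
    ' Hypothetical helper: returns the files directly inside "folder" that match
    ' any of the search patterns (e.g. "*.jpg"), sorted by full path.
    Function GetSortedFiles(folder As String, patterns As IEnumerable(Of String)) As String()
        ' EnumerateFiles yields paths one at a time, but OrderBy still consumes
        ' the whole sequence before producing its first sorted element.
        Return patterns _
            .SelectMany(Function(p) Directory.EnumerateFiles(folder, p)) _
            .OrderBy(Function(f) f) _
            .ToArray()
    End Function
End Module
```

It would be called as GetSortedFiles(romPath, {"*.jpg", "*.bmp", "*.png"}).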
  #3  
Old 09-01-2013, 08:53 AM
Jayme65
Newcomer
 
Join Date: Jan 2012
Posts: 7
Default

AtmaWeapon,

Thanks for your reply!
You're right, I've tested a:
Code:
myFiles = myExtensions.SelectMany(Function(ext) Directory.GetFiles(myPath, ext)).OrderBy(Function(x) x).ToArray
Which gives exactly the same time!

But, for people interested in this topic, I've found that calling GetFiles once and then filtering the results by file extension is far better, especially as the number of extensions to look for grows!

Code:
Dim supportedExtensions As String = "*.zip,*.aaa,*.bbb,*.ccc,*.ddd"
Dim files As String() = Directory.GetFiles(romPath, "*.*", SearchOption.AllDirectories)
Array.Sort(files)

For Each fi As String In files
 If supportedExtensions.Contains(Path.GetExtension(fi)) Then
 ...
 End If
Next
...takes the same amount of time whatever the number of extensions, which is not the case with my previous code.
In my case, with 20,000 files and 6 extension types: 0.2 sec for this method against 0.6 sec for the previous one!
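One caveat with the Contains() check above: it is a substring test against the comma-separated string, so a file with no extension (Path.GetExtension returns an empty string) always passes, a partial extension such as ".zi" passes too, and ".ZIP" does not match ".zip". A possible alternative, sketched here with made-up names (ExtensionFilter, FilterByExtension), is to keep the extensions in a case-insensitive HashSet and test for exact membership. I have not timed this against the version above.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module ExtensionFilter
    ' Hypothetical helper: keeps only the paths whose extension is in the set,
    ' then sorts them. Extensions are given with the leading dot, e.g. ".zip".
    Function FilterByExtension(paths As IEnumerable(Of String),
                               extensions As IEnumerable(Of String)) As String()
        ' Case-insensitive set, so ".ZIP" and ".zip" are treated alike.
        Dim wanted As New HashSet(Of String)(extensions, StringComparer.OrdinalIgnoreCase)
        ' Exact membership test: "" and ".zi" are not in the set, so they are rejected.
        Return paths _
            .Where(Function(p) wanted.Contains(Path.GetExtension(p))) _
            .OrderBy(Function(p) p) _
            .ToArray()
    End Function
End Module
```

It would be fed the single GetFiles() result, for example FilterByExtension(Directory.GetFiles(romPath, "*.*", SearchOption.AllDirectories), {".zip", ".aaa"}).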
  #4  
Old 09-01-2013, 02:16 PM
PlausiblyDamp
Ultimate Contributor

Forum Leader
* Expert *
 
Join Date: Nov 2003
Location: Newport, Wales
Posts: 2,058
Default

No idea if the timing would be any different but you could try
Code:
Dim supportedExtensions As String = "*.zip,*.aaa,*.bbb,*.ccc,*.ddd"
Dim files As String() = Directory.GetFiles(romPath, "*.*", SearchOption.AllDirectories)

For Each fi As String In From fi1 In files.OrderBy(Function(x) x) Where supportedExtensions.Contains(Path.GetExtension(fi1))
    ...
Next
as a pure LINQ solution and remove the explicit loop and array sort.

If you are running on a multi-core system with .NET 4.0 or later, you may get a performance improvement by trying
Code:
Dim supportedExtensions As String = "*.zip,*.aaa,*.bbb,*.ccc,*.ddd"
Dim files As String() = Directory.GetFiles(romPath, "*.*", SearchOption.AllDirectories)
   
For Each fi As String In From fi1 In files.AsParallel().OrderBy(Function(x) x) Where supportedExtensions.Contains(Path.GetExtension(fi1))
    ...
Next
Like any multithreaded code, though, performance may or may not improve depending on the size / type of data.
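Since it depends on the data, the only way to know is to measure on your own file list. A rough sketch of how that could be done with System.Diagnostics.Stopwatch follows; the module and function names (SortTiming, TimeMs, SortSequential, SortParallel) are made up for illustration, and a single run like this is only a coarse indication, not a proper benchmark.

```vb
Imports System
Imports System.Diagnostics
Imports System.Linq

Module SortTiming
    ' Hypothetical helper: runs the given work once and returns elapsed milliseconds.
    Function TimeMs(work As Action) As Long
        Dim sw As Stopwatch = Stopwatch.StartNew()
        work()
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    ' Plain LINQ sort on the calling thread.
    Function SortSequential(paths As String()) As String()
        Return paths.OrderBy(Function(x) x).ToArray()
    End Function

    ' PLINQ sort; the work may be spread across several cores.
    Function SortParallel(paths As String()) As String()
        Return paths.AsParallel().OrderBy(Function(x) x).ToArray()
    End Function
End Module
```

With files holding the GetFiles() result, compare TimeMs(Sub() SortSequential(files)) against TimeMs(Sub() SortParallel(files)), ideally over several runs.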
  #5  
Old 09-01-2013, 08:17 PM
AtmaWeapon
Fabulous Florist

Forum Leader
* Guru *
 
Join Date: Feb 2004
Location: Austin, TX
Posts: 9,500
Default

In my opinion there's dang near never a reason to use the SQL-like syntax, and especially never a reason to mix it with the functional syntax, but I like the idea of trying to parallelize it.

Intuitively, I think it'd be more efficient to do the Where() first. GetFiles() has to enumerate the hard drive once no matter what. OrderBy() is going to have to sort and is likely O(n log(n)), but if you let the Where clause filter first, you'll end up with a smaller overall n.

Code:
Dim supportedExtensions As String = "*.zip,*.aaa,*.bbb,*.ccc,*.ddd"
Dim fileSelector = Function(filePath As String) As Boolean
                    Return supportedExtensions.Contains(Path.GetExtension(filePath))
                   End Function
Dim filteredFiles = Directory.GetFiles(romPath, "*.*", SearchOption.AllDirectories) _
                        .AsParallel() _
                        .Where(fileSelector) _
                        .OrderBy(Function(x) x)
   
For Each fi As String In filteredFiles
    ...
Next
No clue if that's actually valid, C# is much more sane with respect to line breaks and SQL. I don't have VS at home to test it out. I like moving as much out of the For Each statement as I can, though.