Class FileUtility


  • public class FileUtility
    extends java.lang.Object
    Author:
    ubaldino
    • Constructor Summary

      Constructors 
      Constructor Description
      FileUtility()  
    • Method Summary

      Modifier and Type Method Description
      static java.lang.String filenameCleaner​(java.lang.String fname)
      Another utility to deal with unicode in filenames
      static java.lang.String generateUniqueFilename​(java.lang.String F, java.lang.String Ext)
      Generate some filename with a unique date/time stamp
      static java.lang.String generateUniquePath​(java.lang.String D, java.lang.String F, java.lang.String Ext)
      Generate some path with a unique date/time stamp
      static java.lang.String getBasename​(java.lang.String p, java.lang.String ext)
      Get the base name of a file, given any file extension.
      static java.lang.String getFileDescription​(java.lang.String url)
      Get a plain language name of the type of file.
      static java.io.FilenameFilter getFilenameFilter​(java.lang.String ext)
      Simple filter
      static java.io.InputStreamReader getInputStream​(java.io.File f, java.lang.String enc)  
      static java.io.InputStreamReader getInputStream​(java.lang.String fname, java.lang.String enc)  
      static java.io.InputStreamReader getInputStreamReader​(java.io.File f, java.lang.String enc)
      Getting an input stream from a file.
      static java.io.OutputStreamWriter getOutputStream​(java.lang.String fname, java.lang.String enc)
      Caller is responsible for write flush, close, etc.
      static java.io.OutputStreamWriter getOutputStream​(java.lang.String fname, java.lang.String enc, boolean append)
      Caller is responsible for write flush, close, etc.
      static java.io.File getParent​(java.io.File f)  
      static java.io.File getSafeDir​(java.io.File dir, java.lang.String dupeMarker, int maxDups)
      Get a directory that does not conflict with an existing directory.
      static java.io.File getSafeFile​(java.io.File f, java.lang.String dupeMarker, int maxDups)  
      static java.lang.String getValidFilename​(java.lang.String path)
      On occasion a file path may contain Unicode chars; however, as the path is encoded, it may not be decodable by the OS/FS.
      static boolean isArchiveFile​(java.lang.String filepath)
      Check if a file is an archive
      static boolean isArchiveFileType​(java.lang.String ext)
      Allow checking of a file extension; NO prefix "."
      static boolean isAudio​(java.lang.String filepath)
      Checks file extension of given filepath to see if the format is a known audio type.
      static boolean isImage​(java.lang.String filepath)
      Using Commons getExtension(), determine if the filename represents an image media type.
      static boolean isJSONGzip​(java.lang.String path)
      Tell if the file is JSON/Gzip
      static boolean isPlainText​(java.lang.String filepath)
      Tests if a path or file extension ends with .txt. Throws an NPE if null is passed in.
      static boolean isSpreadsheet​(java.lang.String filepath)
      Simple check if a file is typed as a spreadsheet. Tab-delimited .txt files or .dat files may be valid spreadsheets; however, this method does not look inside files.
      static boolean isVideo​(java.lang.String filepath)
      Checks file extension of given filepath to see if the format is a known video type.
      static boolean isWebURL​(java.lang.String link)
      Check if path or URL is a webpage.
      static boolean isWindowsSystem()
      A way of determining the OS. Beware, OS X has Darwin in its full OS name.
      static java.util.Set<java.lang.String> loadDict​(java.io.InputStream io, boolean case_sensitive)
      The do-all method.
      static java.util.Set<java.lang.String> loadDictionary​(java.io.File resourcepath, boolean case_sensitive)
      Load a word list from a file path.
      static java.util.Set<java.lang.String> loadDictionary​(java.lang.String resourcepath, boolean case_sensitive)
      A generic word list loader.
      static java.util.Set<java.lang.String> loadDictionary​(java.net.URL resourcepath, boolean case_sensitive)
      A generic word list loader.
      static boolean makeDirectory​(java.io.File testDir)
      Utility for making dirs
      static boolean makeDirectory​(java.lang.String dir)
      Utility for making dirs
      protected static char normalizeFilenameChar​(char c)
      Tests for valid filename chars for simple normalization: A-Z, a-z, _, -, 0-9.
      static java.lang.String readFile​(java.io.File filepath)  
      static java.lang.String readFile​(java.io.File fileinput, java.lang.String enc)
      Slurps a text file into a string and returns the string.
      static java.lang.String readFile​(java.lang.String filepath)  
      static java.lang.String readGzipFile​(java.lang.String filepath)  
      static boolean removeDirectory​(java.io.File directory)
      Java oddity - recursive removal of a directory
      static boolean writeFile​(java.lang.String buffer, java.lang.String fname)
      Write file, UTF-8 is default charset here.
      static boolean writeFile​(java.lang.String buffer, java.lang.String fname, java.lang.String enc, boolean append)  
      static boolean writeGzipFile​(java.lang.String text, java.lang.String filepath)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • FileUtility

        public FileUtility()
    • Method Detail

      • writeFile

        public static boolean writeFile​(java.lang.String buffer,
                                        java.lang.String fname)
                                 throws java.io.IOException
        Write file, UTF-8 is default charset here.
        Parameters:
        buffer - text to save
        fname - name of file to save
        Returns:
        status true if file was written
        Throws:
        java.io.IOException - if file had IO errors.
      • writeFile

        public static boolean writeFile​(java.lang.String buffer,
                                        java.lang.String fname,
                                        java.lang.String enc,
                                        boolean append)
                                 throws java.io.IOException
        Parameters:
        buffer - text to save
        fname - name of file to save
        enc - text encoding
        append - if you wish to add to existing file.
        Returns:
        status true if file was written
        Throws:
        java.io.IOException - if file had IO errors.
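
        The append and encoding parameters above can be illustrated with a minimal JDK-only sketch. This is not FileUtility's actual implementation; the class WriteFileSketch and the method writeText are hypothetical names that only mirror the documented signature.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class WriteFileSketch {
    /** Hypothetical stand-in mirroring writeFile(buffer, fname, enc, append). */
    static boolean writeText(String buffer, String fname, String enc, boolean append)
            throws IOException {
        // try-with-resources flushes and closes the writer, even if write() fails.
        try (Writer out = new OutputStreamWriter(new FileOutputStream(fname, append), enc)) {
            out.write(buffer);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("write-sketch", ".txt");
        writeText("first\n", tmp.toString(), "UTF-8", false); // overwrite
        writeText("second\n", tmp.toString(), "UTF-8", true); // append
        System.out.print(new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8));
        Files.delete(tmp);
    }
}
```

        With append set to false the file is truncated first; with true the new text is added after the existing content.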
      • getOutputStream

        public static java.io.OutputStreamWriter getOutputStream​(java.lang.String fname,
                                                                 java.lang.String enc,
                                                                 boolean append)
                                                          throws java.io.IOException
        Caller is responsible for write flush, close, etc.
        Parameters:
        fname - file path
        enc - encoding
        append - true = append data to existing file.
        Returns:
        stream writer
        Throws:
        java.io.IOException - if stream could not be opened
      • getOutputStream

        public static java.io.OutputStreamWriter getOutputStream​(java.lang.String fname,
                                                                 java.lang.String enc)
                                                          throws java.io.IOException
        Caller is responsible for write flush, close, etc.
        Parameters:
        fname - file name
        enc - text encoding
        Returns:
        stream writer
        Throws:
        java.io.IOException - if stream could not be opened
      • getInputStreamReader

        public static java.io.InputStreamReader getInputStreamReader​(java.io.File f,
                                                                     java.lang.String enc)
                                                              throws java.io.IOException
        Getting an input stream from a file.
        Parameters:
        f - file object
        enc - encoding of text data
        Returns:
        reader
        Throws:
        java.io.IOException - if failure reading file or using encoding.
      • getInputStream

        public static java.io.InputStreamReader getInputStream​(java.lang.String fname,
                                                               java.lang.String enc)
                                                        throws java.io.IOException
        Throws:
        java.io.IOException
      • getInputStream

        public static java.io.InputStreamReader getInputStream​(java.io.File f,
                                                               java.lang.String enc)
                                                        throws java.io.IOException
        Throws:
        java.io.IOException
      • isSpreadsheet

        public static boolean isSpreadsheet​(java.lang.String filepath)
        Simple check if a file is typed as a spreadsheet. Tab-delimited .txt files or .dat files may be valid spreadsheets; however, this method does not look inside files.
        Parameters:
        filepath - path to file
        Returns:
        true if file represents one of the various spreadsheet file formats
      • isImage

        public static boolean isImage​(java.lang.String filepath)
        Using Commons getExtension(), determine if the filename represents an image media type.
        Parameters:
        filepath - path to file
        Returns:
        true if file represents any type of image
      • isVideo

        public static boolean isVideo​(java.lang.String filepath)
        Checks file extension of given filepath to see if the format is a known video type.
        Parameters:
        filepath - file name or path
        Returns:
        true if file is likely a video file format.
      • isAudio

        public static boolean isAudio​(java.lang.String filepath)
        Checks file extension of given filepath to see if the format is a known audio type.
        Parameters:
        filepath - file name or path
        Returns:
        true if file is likely an audio file format.
      • isArchiveFile

        public static boolean isArchiveFile​(java.lang.String filepath)
        Check if a file is an archive
        Parameters:
        filepath - path to file
        Returns:
        boolean true if file ends with .zip, .tar, .tgz, .gz (includes .tar.gz)
      • isArchiveFileType

        public static boolean isArchiveFileType​(java.lang.String ext)
        Allow checking of a file extension; NO prefix "."
        Parameters:
        ext - extension to test
        Returns:
        boolean true if file ends with .zip, .tar, .tgz, .gz (includes .tar.gz)
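
        The difference between the two archive checks (a full path versus a bare extension with no "." prefix) can be sketched as follows. The extension set comes from the Returns text above; the class and method names are hypothetical, and case-insensitive matching is an assumption rather than documented behavior.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class ArchiveCheckSketch {
    // Extensions named in the Returns text, stored without the leading ".".
    private static final Set<String> ARCHIVE_EXTS =
            new HashSet<>(Arrays.asList("zip", "tar", "tgz", "gz"));

    /** Hypothetical analogue of isArchiveFileType: ext carries NO "." prefix. */
    static boolean isArchiveExt(String ext) {
        // Assumption: matching ignores case.
        return ext != null && ARCHIVE_EXTS.contains(ext.toLowerCase(Locale.ROOT));
    }

    /** Hypothetical analogue of isArchiveFile: tests the text after the last ".". */
    static boolean isArchivePath(String filepath) {
        int dot = filepath.lastIndexOf('.');
        return dot >= 0 && isArchiveExt(filepath.substring(dot + 1));
    }

    public static void main(String[] args) {
        System.out.println(isArchivePath("data/export.tar.gz")); // true (ends with .gz)
        System.out.println(isArchivePath("notes.txt"));          // false
        System.out.println(isArchiveExt(".zip"));                // false: "." prefix not accepted
    }
}
```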
      • isPlainText

        public static boolean isPlainText​(java.lang.String filepath)
        Tests if a path or file extension ends with .txt. Throws an NPE if null is passed in.
        Parameters:
        filepath - path or extension, including "."
        Returns:
        true if is .txt or .TXT
      • readFile

        public static java.lang.String readFile​(java.lang.String filepath)
                                         throws java.io.IOException
        Parameters:
        filepath - path to file
        Returns:
        buffer from file
        Throws:
        java.io.IOException - on error
      • readFile

        public static java.lang.String readFile​(java.io.File filepath)
                                         throws java.io.IOException
        Parameters:
        filepath - path to file
        Returns:
        buffer from file
        Throws:
        java.io.IOException - on error
      • readFile

        public static java.lang.String readFile​(java.io.File fileinput,
                                                java.lang.String enc)
                                         throws java.io.IOException
        Slurps a text file into a string and returns the string.
        Parameters:
        fileinput - file object
        enc - text encoding
        Returns:
        buffer from file
        Throws:
        java.io.IOException - on error
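
        "Slurping" a file means reading it whole and decoding it in one step. A minimal JDK-only sketch of that idea follows; ReadFileSketch and slurp are hypothetical names, and the real method may buffer or decode differently.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

final class ReadFileSketch {
    /** Hypothetical stand-in mirroring readFile(fileinput, enc). */
    static String slurp(File fileinput, String enc) throws IOException {
        // Read every byte, then decode once using the requested charset.
        byte[] raw = Files.readAllBytes(fileinput.toPath());
        return new String(raw, Charset.forName(enc));
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("read-sketch", ".txt");
        Files.write(tmp.toPath(), "caf\u00e9\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(slurp(tmp, "UTF-8"));
        tmp.delete();
    }
}
```

        Reading the whole file into memory suits small text files; very large inputs are usually better streamed through a reader.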
      • readGzipFile

        public static java.lang.String readGzipFile​(java.lang.String filepath)
                                             throws java.io.IOException
        Parameters:
        filepath - path to file
        Returns:
        text buffer, UTF-8 decoded
        Throws:
        java.io.IOException - on error
      • writeGzipFile

        public static boolean writeGzipFile​(java.lang.String text,
                                            java.lang.String filepath)
                                     throws java.io.IOException
        Parameters:
        text - buffer to write
        filepath - path to file
        Returns:
        status true if file was written
        Throws:
        java.io.IOException - on error
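
        A gzip round trip can be sketched with the JDK's java.util.zip streams. The names below are hypothetical; the read side decodes UTF-8 as documented for readGzipFile, while writing UTF-8 is an assumption made so the two sides agree.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipTextSketch {
    /** Hypothetical analogue of writeGzipFile; assumes UTF-8 output. */
    static boolean writeGzip(String text, String filepath) throws IOException {
        try (Writer out = new OutputStreamWriter(
                new GZIPOutputStream(new FileOutputStream(filepath)), StandardCharsets.UTF_8)) {
            out.write(text);
        } // closing the writer also finishes the gzip stream
        return true;
    }

    /** Hypothetical analogue of readGzipFile; decodes UTF-8. */
    static String readGzip(String filepath) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader in = new InputStreamReader(
                new GZIPInputStream(new FileInputStream(filepath)), StandardCharsets.UTF_8)) {
            char[] buf = new char[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        java.io.File tmp = java.io.File.createTempFile("gzip-sketch", ".txt.gz");
        writeGzip("hello gzip\n", tmp.getPath());
        System.out.print(readGzip(tmp.getPath()));
        tmp.delete();
    }
}
```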
      • makeDirectory

        public static boolean makeDirectory​(java.io.File testDir)
                                     throws java.io.IOException
        Utility for making dirs
        Parameters:
        testDir - dir to test
        Returns:
        true if directory was created or if it already exists
        Throws:
        java.io.IOException - if testDir was not created
      • makeDirectory

        public static boolean makeDirectory​(java.lang.String dir)
                                     throws java.io.IOException
        Utility for making dirs
        Parameters:
        dir - dirPath
        Returns:
        true if directory was created or if it already exists
        Throws:
        java.io.IOException - if dir was not created
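
        The documented contract (true when the directory exists or was created, IOException otherwise) can be sketched with java.io.File alone. MakeDirSketch and ensureDir are hypothetical names, and creating missing parent directories is an assumption.

```java
import java.io.File;
import java.io.IOException;

final class MakeDirSketch {
    /** Hypothetical stand-in mirroring makeDirectory(File). */
    static boolean ensureDir(File dir) throws IOException {
        if (dir.isDirectory()) {
            return true; // already there
        }
        // mkdirs() also creates missing parents; re-check in case another
        // process created the directory between the two calls.
        if (dir.mkdirs() || dir.isDirectory()) {
            return true;
        }
        throw new IOException("Could not create directory: " + dir.getAbsolutePath());
    }

    public static void main(String[] args) throws IOException {
        File base = new File(System.getProperty("java.io.tmpdir"), "mkdir-sketch");
        File nested = new File(base, "a/b");
        System.out.println(ensureDir(nested)); // true on first call
        System.out.println(ensureDir(nested)); // true again: already exists
        nested.delete();
        nested.getParentFile().delete();
        base.delete();
    }
}
```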
      • removeDirectory

        public static boolean removeDirectory​(java.io.File directory)
        Java oddity - recursive removal of a directory
        Parameters:
        directory - dir to remove
        Returns:
        true if all contents and the directory itself were removed.
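
        The "oddity" is that File.delete() refuses to remove a non-empty directory, so the contents have to be deleted depth-first. A minimal sketch under that reading follows; RemoveDirSketch and removeTree are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

final class RemoveDirSketch {
    /** Hypothetical stand-in mirroring removeDirectory(File). */
    static boolean removeTree(File directory) {
        // listFiles() returns null for a plain file or an unreadable directory.
        // Caveat: it also descends into symlinked directories.
        File[] children = directory.listFiles();
        if (children != null) {
            for (File child : children) {
                if (!removeTree(child)) {
                    return false; // stop at the first entry that cannot be removed
                }
            }
        }
        return directory.delete(); // succeeds only once the directory is empty
    }

    public static void main(String[] args) throws IOException {
        File root = Files.createTempDirectory("rm-sketch").toFile();
        File sub = new File(root, "sub");
        sub.mkdir();
        new File(sub, "data.txt").createNewFile();
        System.out.println(removeTree(root)); // true
        System.out.println(root.exists());    // false
    }
}
```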
      • generateUniquePath

        public static java.lang.String generateUniquePath​(java.lang.String D,
                                                          java.lang.String F,
                                                          java.lang.String Ext)
        Generate some path with a unique date/time stamp
        Parameters:
        D - directory
        F - filename
        Ext - file extension
        Returns:
        unique path
      • generateUniqueFilename

        public static java.lang.String generateUniqueFilename​(java.lang.String F,
                                                              java.lang.String Ext)
        Generate some filename with a unique date/time stamp
        Parameters:
        F - filename
        Ext - file extension
        Returns:
        unique filename
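
        The stamp format these two methods use is not documented here. The sketch below only illustrates the idea with an assumed pattern (yyyyMMdd_HHmmss_SSS) and an assumed convention that Ext includes its leading "."; all names are hypothetical.

```java
import java.io.File;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class UniqueNameSketch {
    // Assumed stamp layout; the library's real format may differ.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss_SSS");

    /** Hypothetical analogue of generateUniqueFilename(F, Ext). */
    static String stampedName(String f, String ext) {
        return f + "_" + LocalDateTime.now().format(STAMP) + ext;
    }

    /** Hypothetical analogue of generateUniquePath(D, F, Ext). */
    static String stampedPath(String d, String f, String ext) {
        return new File(d, stampedName(f, ext)).getPath();
    }

    public static void main(String[] args) {
        System.out.println(stampedName("report", ".csv"));
        System.out.println(stampedPath("out", "report", ".csv"));
    }
}
```

        A millisecond stamp is only unique as long as two calls do not land in the same millisecond, so callers that need a hard guarantee should still check for an existing file.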
      • getParent

        public static java.io.File getParent​(java.io.File f)
        Parameters:
        f - the file in question.
        Returns:
        the parent File of a given file.
      • getFilenameFilter

        public static java.io.FilenameFilter getFilenameFilter​(java.lang.String ext)
        Simple filter
        Parameters:
        ext - the extension to filter on
        Returns:
        filename filter
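
        A FilenameFilter is typically passed to File.list or File.listFiles. The sketch below shows one plausible shape for an extension filter; byExtension is a hypothetical name, and plain case-sensitive endsWith matching is an assumption.

```java
import java.io.File;
import java.io.FilenameFilter;

final class FilterSketch {
    /** Hypothetical analogue of getFilenameFilter(ext). */
    static FilenameFilter byExtension(final String ext) {
        // Accept any entry whose name ends with the given extension.
        return (dir, name) -> name.endsWith(ext);
    }

    public static void main(String[] args) {
        File cwd = new File(".");
        String[] matches = cwd.list(byExtension(".txt"));
        // list() returns null if cwd is not a readable directory.
        if (matches != null) {
            for (String name : matches) {
                System.out.println(name);
            }
        }
    }
}
```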
      • getBasename

        public static java.lang.String getBasename​(java.lang.String p,
                                                   java.lang.String ext)
        Get the base name of a file, given any file extension. This finds the right-most instance of the file extension and returns everything to its left as the file basename. Commons IO FilenameUtils says nothing about arbitrarily long file extensions, e.g., file.a.b.c.txt split into ("file" + "a.b.c.txt").
        Parameters:
        p - path
        ext - extension
        Returns:
        basename of path, less the extension
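
        The right-most-match rule described above can be sketched as follows. BasenameSketch and basename are hypothetical names; stripping the directory part and the separating "." are assumptions drawn from the ("file" + "a.b.c.txt") example.

```java
import java.io.File;

final class BasenameSketch {
    /** Hypothetical stand-in mirroring getBasename(p, ext). */
    static String basename(String p, String ext) {
        String name = new File(p).getName(); // assumption: directories are dropped
        int idx = ext.isEmpty() ? -1 : name.lastIndexOf(ext);
        if (idx < 0) {
            return name; // extension not present: nothing to strip
        }
        String base = name.substring(0, idx);
        // Drop the "." that separated the base from the extension, if any.
        return base.endsWith(".") ? base.substring(0, base.length() - 1) : base;
    }

    public static void main(String[] args) {
        System.out.println(basename("archive/file.a.b.c.txt", "a.b.c.txt")); // file
        System.out.println(basename("report.txt", "txt"));                   // report
        System.out.println(basename("notes.txt", "csv"));                    // notes.txt
    }
}
```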
      • getValidFilename

        public static java.lang.String getValidFilename​(java.lang.String path)
        On occasion a file path may contain Unicode chars; however, as the path is encoded, it may not be decodable by the OS/FS.
        Parameters:
        path - path to normalize
        Returns:
        filename
      • filenameCleaner

        public static java.lang.String filenameCleaner​(java.lang.String fname)
        Another utility to deal with unicode in filenames
        Parameters:
        fname - name to clean
        Returns:
        cleaner filename
      • getSafeDir

        public static java.io.File getSafeDir​(java.io.File dir,
                                              java.lang.String dupeMarker,
                                              int maxDups)
        Get a directory that does not conflict with an existing directory. Returns null if that is not possible within the maxDups.
        Parameters:
        dir - directory
        dupeMarker - incrementor
        maxDups - max incrementor
        Returns:
        file object
      • getSafeFile

        public static java.io.File getSafeFile​(java.io.File f,
                                               java.lang.String dupeMarker,
                                               int maxDups)
        Parameters:
        f - file obj
        dupeMarker - incrementor
        maxDups - max incrementor
        Returns:
        new file
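
        The dupeMarker and maxDups parameters suggest a loop that tries numbered variants until a free name is found. The naming scheme below (base + marker + counter + extension) is an assumption for illustration, as are the names SafeFileSketch and safeFile; returning null when maxDups is exhausted follows the getSafeDir description.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

final class SafeFileSketch {
    /** Hypothetical stand-in mirroring getSafeFile(f, dupeMarker, maxDups). */
    static File safeFile(File f, String dupeMarker, int maxDups) {
        if (!f.exists()) {
            return f; // no conflict, use the name as given
        }
        String name = f.getName();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        for (int i = 1; i <= maxDups; i++) {
            // Assumed scheme: report.txt -> report_1.txt, report_2.txt, ...
            File candidate = new File(f.getParentFile(), base + dupeMarker + i + ext);
            if (!candidate.exists()) {
                return candidate;
            }
        }
        return null; // every variant up to maxDups is taken
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("safe-sketch").toFile();
        File original = new File(dir, "report.txt");
        original.createNewFile();
        System.out.println(safeFile(original, "_", 5).getName()); // report_1.txt
        original.delete();
        dir.delete();
    }
}
```

        Checking exists() and then creating the file is not atomic, so concurrent writers would still need their own coordination.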
      • normalizeFilenameChar

        protected static char normalizeFilenameChar​(char c)
        Tests for valid filename chars for simple normalization: A-Z, a-z, _, -, 0-9.
        Parameters:
        c - character to allow
        Returns:
        given character or replacement char
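
        A character-level whitelist like the one described can be sketched as below. The allowed set follows the description (A-Z, a-z, 0-9, underscore, hyphen); the replacement character '_' and the names are assumptions, since the actual replacement is not stated here.

```java
final class FilenameCharSketch {
    /** Hypothetical stand-in mirroring normalizeFilenameChar(c). */
    static char normalizeChar(char c) {
        boolean allowed = (c >= 'A' && c <= 'Z')
                || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9')
                || c == '_' || c == '-';
        return allowed ? c : '_'; // assumption: '_' is the replacement char
    }

    /** Applies the per-character rule to a whole name. */
    static String clean(String fname) {
        StringBuilder sb = new StringBuilder(fname.length());
        for (int i = 0; i < fname.length(); i++) {
            sb.append(normalizeChar(fname.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Accented letters, the space, and the dot all fall outside the whitelist.
        System.out.println(clean("r\u00e9sum\u00e9 2024.txt")); // r_sum__2024_txt
    }
}
```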
      • isWindowsSystem

        public static boolean isWindowsSystem()
        A way of determining the OS. Beware, OS X has Darwin in its full OS name.
        Returns:
        true if OS is Windows-based
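
        The Darwin warning presumably refers to naive substring tests: "darwin" contains "win". One way to sidestep that, sketched here with hypothetical names, is to test the start of the os.name system property instead; this is not necessarily how FileUtility does it.

```java
import java.util.Locale;

final class OsCheckSketch {
    /** Hypothetical helper: true if the given OS name denotes Windows. */
    static boolean isWindows(String osName) {
        // A prefix test avoids matching the "win" inside "darwin".
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    public static void main(String[] args) {
        System.out.println(isWindows(System.getProperty("os.name")));
        System.out.println(isWindows("Darwin")); // false
    }
}
```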
      • loadDictionary

        public static java.util.Set<java.lang.String> loadDictionary​(java.lang.String resourcepath,
                                                                     boolean case_sensitive)
                                                              throws java.io.IOException
        A generic word list loader.
        Parameters:
        resourcepath - classpath location of a resource
        case_sensitive - if terms are loaded with case preserved or not.
        Returns:
        Set containing unique words found in resourcepath
        Throws:
        java.io.IOException - on error, resource does not exist
      • loadDictionary

        public static java.util.Set<java.lang.String> loadDictionary​(java.net.URL resourcepath,
                                                                     boolean case_sensitive)
                                                              throws java.io.IOException
        A generic word list loader.
        Parameters:
        resourcepath - classpath location of a resource
        case_sensitive - if terms are loaded with case preserved or not.
        Returns:
        Set containing unique words found in resourcepath
        Throws:
        java.io.IOException - on error, resource does not exist
      • loadDict

        public static java.util.Set<java.lang.String> loadDict​(java.io.InputStream io,
                                                               boolean case_sensitive)
                                                        throws java.io.IOException
        The do-all method. Loads the dictionary from a stream. This closes the stream when done.
        Parameters:
        io - stream
        case_sensitive - true if data should be loaded preserving case
        Returns:
        set of phrases from file.
        Throws:
        java.io.IOException - on IO error
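
        The loading behavior described above (unique terms, optional case folding, stream closed at the end) can be sketched as follows. The file format is not documented here, so one term per line, UTF-8 text, trimming, and skipping blank lines are all assumptions; the names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class DictLoaderSketch {
    /** Hypothetical stand-in mirroring loadDict(io, case_sensitive). */
    static Set<String> load(InputStream io, boolean caseSensitive) throws IOException {
        Set<String> terms = new HashSet<>();
        // Closing the reader also closes the underlying stream.
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(io, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String term = line.trim();
                if (term.isEmpty()) {
                    continue; // assumption: blank lines are ignored
                }
                terms.add(caseSensitive ? term : term.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "Apple\napple\n\nPear\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(load(new ByteArrayInputStream(data), false).size()); // 2
        System.out.println(load(new ByteArrayInputStream(data), true).size());  // 3
    }
}
```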
      • loadDictionary

        public static java.util.Set<java.lang.String> loadDictionary​(java.io.File resourcepath,
                                                                     boolean case_sensitive)
                                                              throws java.io.IOException
        Load a word list from a file path.
        Parameters:
        resourcepath - File object to load
        case_sensitive - if dictionary is loaded with case or not.
        Returns:
        a Set object containing distinct dictionary terms
        Throws:
        java.io.IOException - if load fails
      • getFileDescription

        public static java.lang.String getFileDescription​(java.lang.String url)
        Get a plain language name of the type of file. E.g., document, image, spreadsheet, web page. Rather than the MIME type technical descriptor.
        Parameters:
        url - item to describe
        Returns:
        plain language description of the URL
      • isWebURL

        public static boolean isWebURL​(java.lang.String link)
        Check if path or URL is a webpage. This is helpful for looking at found URLs in unstructured data.
        Parameters:
        link - a URL
        Returns:
        true if link looks like a URL (i.e., if it starts with http: or https:)
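
        The prefix rule in the Returns text can be sketched in a few lines. The names are hypothetical; treating null as false and ignoring case are assumptions beyond what is documented.

```java
import java.util.Locale;

final class WebUrlSketch {
    /** Hypothetical stand-in mirroring isWebURL(link). */
    static boolean looksLikeWebUrl(String link) {
        if (link == null) {
            return false; // assumption: null is simply not a URL
        }
        String s = link.toLowerCase(Locale.ROOT);
        return s.startsWith("http:") || s.startsWith("https:");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeWebUrl("https://example.org/page")); // true
        System.out.println(looksLikeWebUrl("ftp://example.org/file"));   // false
        System.out.println(looksLikeWebUrl("/var/data/page.html"));      // false
    }
}
```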
      • isJSONGzip

        public static boolean isJSONGzip​(java.lang.String path)
        Tell if the file is JSON/Gzip
        Parameters:
        path - input file path
        Returns:
        true if file ends with json.gz or contains json and ends with .gz
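
        Both conditions in the Returns text reduce to "ends with .gz and mentions json", since any name ending in json.gz also contains json. A sketch under that reading follows; the names are hypothetical and case-insensitive matching is an assumption.

```java
import java.util.Locale;

final class JsonGzipSketch {
    /** Hypothetical stand-in mirroring isJSONGzip(path). */
    static boolean isJsonGzip(String path) {
        String p = path.toLowerCase(Locale.ROOT);
        // "ends with json.gz" is covered by this combined test.
        return p.endsWith(".gz") && p.contains("json");
    }

    public static void main(String[] args) {
        System.out.println(isJsonGzip("data.json.gz"));        // true
        System.out.println(isJsonGzip("tweets-json-2020.gz")); // true
        System.out.println(isJsonGzip("data.json"));           // false
        System.out.println(isJsonGzip("archive.gz"));          // false
    }
}
```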