Class FileUtility

java.lang.Object
org.opensextant.util.FileUtility

public class FileUtility extends Object
Author:
ubaldino
  • Method Details

    • writeFile

      public static boolean writeFile(String buffer, String fname) throws IOException
      Writes a file; UTF-8 is the default charset here.
      Parameters:
      buffer - text to save
      fname - name of file to save
      Returns:
      status true if file was written
      Throws:
      IOException - if file had IO errors.
    • writeFile

      public static boolean writeFile(String buffer, String fname, String enc, boolean append) throws IOException
      Parameters:
      buffer - text to save
      fname - name of file to save
      enc - text encoding
      append - if you wish to add to existing file.
      Returns:
      status if written
      Throws:
      IOException - if file had IO errors.
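      The enc and append parameters above can be illustrated with the JDK alone. The following is a minimal sketch, not the library's implementation: the class WriteFileSketch and method writeText are hypothetical names, and returning true after a completed write is an assumption based only on the "Returns" text.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for writeFile(buffer, fname, enc, append); JDK only.
final class WriteFileSketch {

    static boolean writeText(String buffer, String fname, String enc, boolean append)
            throws IOException {
        // The second FileOutputStream argument selects append mode;
        // try-with-resources flushes and closes the writer.
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(fname, append), Charset.forName(enc))) {
            out.write(buffer);
        }
        return true; // assumption: true simply means the write completed
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sketch", ".txt");
        writeText("first\n", tmp.toString(), "UTF-8", false); // overwrite
        writeText("second\n", tmp.toString(), "UTF-8", true); // append
        System.out.print(new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8));
        Files.delete(tmp);
    }
}
```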
    • getOutputStream

      public static OutputStreamWriter getOutputStream(String fname, String enc, boolean append) throws IOException
      Caller is responsible for write flush, close, etc.
      Parameters:
      fname - file path
      enc - encoding
      append - true = append data to existing file.
      Returns:
      stream writer
      Throws:
      IOException - if stream could not be opened
    • getOutputStream

      public static OutputStreamWriter getOutputStream(String fname, String enc) throws IOException
      Caller is responsible for write flush, close, etc.
      Parameters:
      fname - file name
      enc - text encoding
      Returns:
      stream writer
      Throws:
      IOException - if stream could not be opened
    • getInputStreamReader

      public static InputStreamReader getInputStreamReader(File f, String enc) throws IOException
      Getting an input stream from a file.
      Parameters:
      f - file object
      enc - encoding of text data
      Returns:
      reader
      Throws:
      IOException - if failure reading file or using encoding.
    • getInputStream

      public static InputStreamReader getInputStream(String fname, String enc) throws IOException
      Throws:
      IOException
    • getInputStream

      public static InputStreamReader getInputStream(File f, String enc) throws IOException
      Throws:
      IOException
    • isSpreadsheet

      public static boolean isSpreadsheet(String filepath)
      Simple check of whether a file is typed as a spreadsheet. Tab-delimited .txt files or .dat files may be valid spreadsheets; however, this method does not look inside files.
      Parameters:
      filepath - path to file
      Returns:
      true if file represents one of the various spreadsheet file formats
    • isImage

      public static boolean isImage(String filepath)
      Using Commons getExtension(), determine if the filename represents an image media type.
      Parameters:
      filepath - path to file
      Returns:
      true if file represents any type of image
    • isVideo

      public static boolean isVideo(String filepath)
      Checks file extension of given filepath to see if the format is a known video type.
      Parameters:
      filepath - file name or path
      Returns:
      true if file is likely a video file format.
    • isAudio

      public static boolean isAudio(String filepath)
      Checks file extension of given filepath to see if the format is a known audio type.
      Parameters:
      filepath - file name or path
      Returns:
      true if file is likely an audio file format.
    • isArchiveFile

      public static boolean isArchiveFile(String filepath)
      Check if a file is an archive
      Parameters:
      filepath - path to file
      Returns:
      boolean true if file ends with .zip, .tar, .tgz, .gz (includes .tar.gz)
    • isArchiveFileType

      public static boolean isArchiveFileType(String ext)
      Allows checking of a file extension; NO prefix "."
      Parameters:
      ext - extension to test
      Returns:
      boolean true if file ends with .zip, .tar, .tgz, .gz (includes .tar.gz)
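      The difference between the two archive checks (a full path versus a bare extension without the leading ".") can be sketched as follows. This is an illustration under stated assumptions, not the library's code: ArchiveCheckSketch is a hypothetical class, and case-insensitive matching is assumed because the page does not say how case is handled.

```java
import java.util.Locale;

// Hypothetical sketch of the extension rules listed under "Returns".
final class ArchiveCheckSketch {

    /** ext is given WITHOUT a leading dot, e.g. "zip". */
    static boolean isArchiveFileType(String ext) {
        if (ext == null) {
            return false;
        }
        switch (ext.toLowerCase(Locale.ROOT)) { // assumption: case-insensitive
            case "zip":
            case "tar":
            case "tgz":
            case "gz": // also covers names ending in .tar.gz
                return true;
            default:
                return false;
        }
    }

    /** Takes the text after the last dot and tests it as an extension. */
    static boolean isArchiveFile(String filepath) {
        int dot = filepath.lastIndexOf('.');
        if (dot < 0 || dot == filepath.length() - 1) {
            return false; // no extension at all
        }
        return isArchiveFileType(filepath.substring(dot + 1));
    }
}
```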
    • isPlainText

      public static boolean isPlainText(String filepath)
      Tests if a path or file extension ends with .txt. Throws NPE if null is passed in.
      Parameters:
      filepath - path or extension, including "."
      Returns:
      true if is .txt or .TXT
    • readFile

      public static String readFile(String filepath) throws IOException
      Parameters:
      filepath - path to file
      Returns:
      buffer from file
      Throws:
      IOException - on error
    • readFile

      public static String readFile(File filepath) throws IOException
      Parameters:
      filepath - path to file
      Returns:
      buffer from file
      Throws:
      IOException - on error
    • readFile

      public static String readFile(File fileinput, String enc) throws IOException
      Slurps a text file into a string and returns the string.
      Parameters:
      fileinput - file object
      enc - text encoding
      Returns:
      buffer from file
      Throws:
      IOException - on error
    • readGzipFile

      public static String readGzipFile(String filepath) throws IOException
      Parameters:
      filepath - path to file
      Returns:
      text buffer, UTF-8 decoded
      Throws:
      IOException - on error
    • writeGzipFile

      public static boolean writeGzipFile(String text, String filepath) throws IOException
      Parameters:
      text - buffer to write
      filepath - path to file
      Returns:
      status true if file was written
      Throws:
      IOException - on error
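      A gzip text round trip of the kind readGzipFile and writeGzipFile describe can be sketched with java.util.zip. The sketch works on byte arrays rather than files so it stays self-contained; GzipTextSketch, gzip and gunzip are hypothetical names, and writing as UTF-8 is an assumption (the page only states UTF-8 for reading).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical in-memory gzip round trip; not the library's implementation.
final class GzipTextSketch {

    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream writes the gzip trailer.
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8)); // assumption: UTF-8
        }
        return bytes.toByteArray();
    }

    static String gunzip(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```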
    • makeDirectory

      public static boolean makeDirectory(File testDir) throws IOException
      Utility for making dirs
      Parameters:
      testDir - dir to test
      Returns:
      if directory was created or if it already exists
      Throws:
      IOException - if testDir was not created
    • makeDirectory

      public static boolean makeDirectory(String dir) throws IOException
      Utility for making dirs
      Parameters:
      dir - dirPath
      Returns:
      if directory was created or if it already exists
      Throws:
      IOException - if dir was not created
    • removeDirectory

      public static boolean removeDirectory(File directory)
      Java oddity - recursive removal of a directory
      Parameters:
      directory - dir to remove
      Returns:
      true if all contents and the directory itself were removed.
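      The "oddity" is that java.io.File.delete() refuses to delete a non-empty directory, so the contents must be removed first. A minimal recursive sketch follows; RemoveDirSketch is a hypothetical class, and returning false for a missing directory is an assumption, not documented behavior.

```java
import java.io.File;

// Hypothetical recursive delete using only java.io.File.
final class RemoveDirSketch {

    static boolean removeDirectory(File directory) {
        if (directory == null || !directory.exists()) {
            return false; // assumption: nothing to remove counts as failure
        }
        boolean ok = true;
        File[] children = directory.listFiles(); // null if not a directory
        if (children != null) {
            for (File child : children) {
                if (child.isDirectory()) {
                    ok &= removeDirectory(child); // depth-first
                } else {
                    ok &= child.delete();
                }
            }
        }
        // The directory can only be deleted once it is empty.
        return directory.delete() && ok;
    }
}
```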
    • generateUniquePath

      public static String generateUniquePath(String D, String F, String Ext)
      Generate some path with a unique date/time stamp
      Parameters:
      D - directory
      F - filename
      Ext - file extension
      Returns:
      unique path
    • generateUniqueFilename

      public static String generateUniqueFilename(String F, String Ext)
      Generate some filename with a unique date/time stamp
      Parameters:
      F - filename
      Ext - file extension
      Returns:
      unique filename
    • getParent

      public static File getParent(File f)
      Parameters:
      f - the file in question.
      Returns:
      the parent File of a given file.
    • getFilenameFilter

      public static FilenameFilter getFilenameFilter(String ext)
      Simple filter
      Parameters:
      ext - the extension to filter on
      Returns:
      filename filter
    • getBasename

      public static String getBasename(String p, String ext)
      Gets the base name of a file, given any file extension. This will find the right-most instance of a file extension and return the left-hand side of that as the file basename. Commons IO FilenameUtils says nothing about arbitrarily long file extensions, e.g., file.a.b.c.txt split into ("file" + "a.b.c.txt")
      Parameters:
      p - path
      ext - extension
      Returns:
      basename of path, less the extension
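      The file.a.b.c.txt case can be worked through with a small sketch. It is one reading of the description above, not the library's code: BasenameSketch is a hypothetical class, and stripping the directory part and tolerating an extension given with or without a leading "." are both assumptions.

```java
// Hypothetical sketch: cut the name at the right-most occurrence of the extension.
final class BasenameSketch {

    static String getBasename(String p, String ext) {
        if (p == null) {
            return null;
        }
        // Assumption: directories are dropped before the extension is matched.
        int slash = Math.max(p.lastIndexOf('/'), p.lastIndexOf('\\'));
        String name = p.substring(slash + 1);
        if (ext == null || ext.isEmpty()) {
            return name;
        }
        String suffix = ext.startsWith(".") ? ext : "." + ext;
        int at = name.lastIndexOf(suffix); // right-most instance
        return at > 0 ? name.substring(0, at) : name;
    }

    public static void main(String[] args) {
        // "file.a.b.c.txt" with extension "a.b.c.txt" leaves "file".
        System.out.println(getBasename("/data/file.a.b.c.txt", "a.b.c.txt"));
    }
}
```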
    • getValidFilename

      public static String getValidFilename(String path)
      On occasion a file path may contain Unicode chars; however, as the path is encoded, it may not be decodable by the OS/FS.
      Parameters:
      path - path to normalize
      Returns:
      filename
    • filenameCleaner

      public static String filenameCleaner(String fname)
      Another utility to deal with unicode in filenames
      Parameters:
      fname - name to clean
      Returns:
      cleaner filename
    • getSafeDir

      public static File getSafeDir(File dir, String dupeMarker, int maxDups)
      Get a directory that does not conflict with an existing directory. Returns null if that is not possible within the maxDups.
      Parameters:
      dir - directory
      dupeMarker - incrementor
      maxDups - max incrementor
      Returns:
      file object
    • getSafeFile

      public static File getSafeFile(File f, String dupeMarker, int maxDups)
      Parameters:
      f - file obj
      dupeMarker - incrementor
      maxDups - max incrementor
      Returns:
      new file
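      The dupeMarker and maxDups parameters of getSafeDir and getSafeFile suggest a "try numbered alternatives until one is free" loop. The sketch below shows that idea for files; it is a guess at the scheme, not the library's code. SafeFileSketch is hypothetical, the name pattern base + dupeMarker + counter + extension is an assumption, and returning null when no free name is found is borrowed from the getSafeDir description.

```java
import java.io.File;

// Hypothetical sketch of picking a non-conflicting file name.
final class SafeFileSketch {

    static File getSafeFile(File f, String dupeMarker, int maxDups) {
        if (!f.exists()) {
            return f; // no conflict, use the file as given
        }
        String name = f.getName();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        for (int i = 1; i <= maxDups; i++) {
            // Assumed pattern, e.g. out.txt -> out_1.txt, out_2.txt, ...
            File candidate = new File(f.getParentFile(), base + dupeMarker + i + ext);
            if (!candidate.exists()) {
                return candidate;
            }
        }
        return null; // no free name within maxDups attempts
    }
}
```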
    • normalizeFilenameChar

      protected static char normalizeFilenameChar(char c)
      Tests for valid filename chars for simple normalization: A-Z, a-z, _-, 0-9
      Parameters:
      c - character to allow
      Returns:
      given character or replacement char
    • isWindowsSystem

      public static boolean isWindowsSystem()
      A way of determining the OS. Beware, OS X has Darwin in its full OS name.
      Returns:
      if OS is windows-based
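      The Darwin warning matters because a naive substring test for "win" also matches "Darwin". A minimal sketch of a safer check, assuming the os.name system property is the source of the OS name (the page does not say which property the library reads); OsCheckSketch and isWindowsName are hypothetical names.

```java
import java.util.Locale;

// Hypothetical OS check that avoids the "Darwin contains win" trap.
final class OsCheckSketch {

    static boolean isWindowsName(String osName) {
        // Match on the prefix rather than on any occurrence of "win".
        return osName != null
                && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    static boolean isWindowsSystem() {
        return isWindowsName(System.getProperty("os.name"));
    }

    public static void main(String[] args) {
        System.out.println("Windows? " + isWindowsSystem());
    }
}
```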
    • loadDictionary

      public static Set<String> loadDictionary(String resourcepath, boolean case_sensitive) throws IOException
      A generic word list loader.
      Parameters:
      resourcepath - classpath location of a resource
      case_sensitive - if terms are loaded with case preserved or not.
      Returns:
      Set containing unique words found in resourcepath
      Throws:
      IOException - on error, resource does not exist
    • loadDictionary

      public static Set<String> loadDictionary(URL resourcepath, boolean case_sensitive) throws IOException
      A generic word list loader.
      Parameters:
      resourcepath - classpath location of a resource
      case_sensitive - if terms are loaded with case preserved or not.
      Returns:
      Set containing unique words found in resourcepath
      Throws:
      IOException - on error, resource does not exist
    • loadDict

      public static Set<String> loadDict(InputStream io, boolean case_sensitive) throws IOException
      The do-all method. Loads the dictionary from a stream. This closes the stream when done.
      Parameters:
      io - stream
      case_sensitive - true if data should be loaded preserving case
      Returns:
      set of phrases from file.
      Throws:
      IOException - on IO error
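      How case_sensitive affects the resulting Set can be sketched with a plain JDK reader. This is an illustration only: DictLoaderSketch is a hypothetical class, and the file format (UTF-8, one term per line, blank lines and lines starting with "#" skipped) is an assumption that this page does not confirm.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical word-list loader; not the library's implementation.
final class DictLoaderSketch {

    static Set<String> loadDict(InputStream io, boolean caseSensitive) throws IOException {
        Set<String> terms = new HashSet<>();
        // try-with-resources closes the underlying stream when done.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(io, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String term = line.trim();
                if (term.isEmpty() || term.startsWith("#")) {
                    continue; // assumption: blanks and comments are ignored
                }
                // Without case sensitivity, terms collapse to lower case.
                terms.add(caseSensitive ? term : term.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }
}
```

      With the input lines "Apple", "apple" and "Pear", this sketch yields three terms when case is preserved and two when it is not.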
    • loadDictionary

      public static Set<String> loadDictionary(File resourcepath, boolean case_sensitive) throws IOException
      Load a word list from a file path.
      Parameters:
      resourcepath - File object to load
      case_sensitive - if dictionary is loaded with case or not.
      Returns:
      a Set object containing distinct dictionary terms
      Throws:
      IOException - if load fails
    • getFileDescription

      public static String getFileDescription(String url)
      Get a plain language name of the type of file. E.g., document, image, spreadsheet, web page. Rather than the MIME type technical descriptor.
      Parameters:
      url - item to describe
      Returns:
      plain language description of the URL
    • isWebURL

      public static boolean isWebURL(String link)
      Check if path or URL is a webpage. This is helpful for looking at found URLs in unstructured data.
      Parameters:
      link - a URL
      Returns:
      true if link looks like a URL (i.e., if it starts with http: or https:)
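      The prefix rule stated under "Returns" amounts to a simple string test. A minimal sketch follows; WebUrlSketch is a hypothetical class, and both the null handling and the case-insensitive comparison are assumptions.

```java
import java.util.Locale;

// Hypothetical sketch of the http:/https: prefix test.
final class WebUrlSketch {

    static boolean isWebURL(String link) {
        if (link == null) {
            return false; // assumption: null is simply not a web URL
        }
        String lower = link.toLowerCase(Locale.ROOT); // assumption: case-insensitive
        return lower.startsWith("http:") || lower.startsWith("https:");
    }
}
```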
    • isJSONGzip

      public static boolean isJSONGzip(String path)
      Tell if the file is JSON/Gzip
      Parameters:
      path - input file path
      Returns:
      true if file ends with json.gz, or contains json and ends with .gz
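      Read literally, the rule above reduces to "ends with .gz and mentions json somewhere", since any name ending in json.gz already satisfies the second condition. A small sketch of that reading, with JsonGzipSketch as a hypothetical class and case-insensitive matching as an assumption:

```java
import java.util.Locale;

// Hypothetical sketch of the JSON/Gzip filename test.
final class JsonGzipSketch {

    static boolean isJSONGzip(String path) {
        if (path == null) {
            return false; // assumption: null is not a JSON/Gzip file
        }
        String lower = path.toLowerCase(Locale.ROOT); // assumption: case-insensitive
        // Covers both "data.json.gz" and names like "json-dump-2020.gz".
        return lower.endsWith(".gz") && lower.contains("json");
    }
}
```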