Class AsciiFileParser

java.lang.Object
uk.ac.starlink.util.AsciiFileParser

public class AsciiFileParser extends Object
Generalised parser for data stored as a table in a plain text file. The following assumptions are made about the structure of these files:
  • Comments may occupy a whole line or follow data on the same line (in-line). Each kind of comment is introduced by a single character. Completely empty lines are ignored.
  • The number of fields (i.e. columns) in the table may be fixed, in which case a row with a different number of fields can be treated as an error.
  • The data type of each field is not known in advance; the value of a field can be requested as a String, integer, double or boolean.
  • Fields are separated by a given character (e.g. space, comma, tab), and repeated separators are contracted to a single instance (so multiple spaces count as one separator).
This class is only suitable for files that are expected to contain small amounts of data.
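The rules above can be illustrated with a minimal, standard-library-only sketch. This is not the AsciiFileParser source: the class name TableLineSketch and the method splitLine are hypothetical, and the sketch assumes that a whole-line comment is recognised by its first non-blank character and that whitespace-only lines are skipped like empty ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

/** Illustrative only: a guess at the line-level rules, not the real parser. */
final class TableLineSketch {

    /**
     * Split one line into fields, or return null if the line holds no data
     * (empty line, whole-line comment, or nothing before an in-line comment).
     */
    static String[] splitLine(String line, char singleComment,
                              char inlineComment, String delims) {
        String text = line.trim();
        if (text.isEmpty() || text.charAt(0) == singleComment) {
            return null; // assumption: blank and comment lines carry no row
        }
        int cut = text.indexOf(inlineComment);
        if (cut >= 0) {
            text = text.substring(0, cut); // drop the in-line comment
        }
        // StringTokenizer treats a run of delimiters as a single separator.
        StringTokenizer st = new StringTokenizer(text, delims);
        List<String> fields = new ArrayList<>();
        while (st.hasMoreTokens()) {
            fields.add(st.nextToken());
        }
        return fields.isEmpty() ? null : fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] lines = {
            "# whole-line comment",
            "",
            "1   2.5  true ! trailing comment",
        };
        for (String line : lines) {
            String[] fields = splitLine(line, '#', '!', " \t\n\r\f");
            if (fields != null) {
                System.out.println(String.join("|", fields)); // prints 1|2.5|true
            }
        }
    }
}
```

The comment characters and delimiter string passed in main are the documented defaults of singleComment, inlineComment and delims.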
Author:
Peter W. Draper
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected String
    delims
    The permissible delimiters between fields.
    protected boolean
    fixed
    Whether the number of fields is fixed.
    protected char
    inlineComment
    The character used for in-line comments.
    protected int
    nFields
    The number of fixed fields in the file.
    protected ArrayList<String[]>
    rowList
    A list holding one String array per row, containing the fields parsed from that row.
    protected char
    singleComment
    The character used for single-line comments.
  • Constructor Summary

    Constructors
    Constructor
    Description
    AsciiFileParser()
    Create an instance.
    AsciiFileParser(boolean fixed)
    Create an instance.
    AsciiFileParser(File file)
    Create an instance and parse a given File.
    AsciiFileParser(File file, boolean fixed)
    Create an instance and parse a given File.
  • Method Summary

    Modifier and Type
    Method
    Description
    protected void
    decode(File file)
    Open, read and decode the contents of the file.
    boolean
    getBooleanField(int row, int column)
    Get the boolean value of a field.
    String
    getDelimeters()
    Get the characters used as field delimiters.
    double
    getDoubleField(int row, int column)
    Get the double precision value of a field.
    char
    getInlineCommentChar()
    Get the character used for in-line comments.
    int
    getIntegerField(int row, int column)
    Get the integer value of a field.
    int
    getNFields()
    Get the number of fields located in the file.
    int
    getNFields(int row)
    Get the number of fields in a row.
    int
    getNRows()
    Get the number of rows located in the file.
    String[]
    getRow(int row)
    Get the parsed Strings in a row.
    char
    getSingleCommentChar()
    Get the character used for single line comments.
    String
    getStringField(int row, int column)
    Get the String value of a field.
    boolean
    isFixed()
    Get whether the file is expected to have a fixed number of fields.
    void
    parse(File file)
    Parse a file using the current configuration.
    void
    setDelimeters(String delims)
    Set the characters used as field delimiters.
    void
    setFixed(boolean fixed)
    Set whether the file is expected to have a fixed number of fields.
    void
    setInlineCommentChar(char inlineComment)
    Set the character used for in-line comments.
    void
    setSingleCommentChar(char singleComment)
    Set the character used for single line comments.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • fixed

      protected boolean fixed
      Whether the number of fields is fixed.
    • nFields

      protected int nFields
      The number of fixed fields in the file.
    • rowList

      protected ArrayList<String[]> rowList
      A list holding one String array per row, containing the fields parsed from that row.
    • singleComment

      protected char singleComment
      The character used for single-line comments. Defaults to #.
    • inlineComment

      protected char inlineComment
      The character used for inline comments. Defaults to !.
    • delims

      protected String delims
      The permissible delimiters between fields. The defaults are those of StringTokenizer, " \t\n\r\f": the space character, the tab character, the newline character, the carriage-return character, and the form-feed character.
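Since the defaults are taken from StringTokenizer, its behaviour is a reasonable guide to how a delimiter string is interpreted. The sketch below uses only java.util.StringTokenizer; the class DelimiterSketch and the method countFields are hypothetical names, and whether AsciiFileParser tokenizes in exactly this way is an assumption.

```java
import java.util.StringTokenizer;

/** Illustrative only: shows how StringTokenizer treats a delimiter set. */
final class DelimiterSketch {

    /** Count the tokens StringTokenizer finds in text for the given delimiters. */
    static int countFields(String text, String delims) {
        return new StringTokenizer(text, delims).countTokens();
    }

    public static void main(String[] args) {
        // Whitespace defaults: a run of spaces and tabs is one separator.
        System.out.println(countFields("a  \t b   c", " \t\n\r\f")); // prints 3
        // Comma plus space: the gap in "a,,b" does not produce an empty field.
        System.out.println(countFields("a,,b, c", ", "));             // prints 3
    }
}
```

One consequence of contracting repeated delimiters is that an empty field cannot be expressed by two adjacent separators; a table with missing values would need an explicit placeholder token.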
  • Constructor Details

    • AsciiFileParser

      public AsciiFileParser()
      Create an instance.
    • AsciiFileParser

      public AsciiFileParser(boolean fixed)
      Create an instance.
      Parameters:
      fixed - whether fixed format is required.
    • AsciiFileParser

      public AsciiFileParser(File file)
      Create an instance and parse a given File.
      Parameters:
      file - a File referring to the input file.
    • AsciiFileParser

      public AsciiFileParser(File file, boolean fixed)
      Create an instance and parse a given File.
      Parameters:
      file - a File referring to the input file.
      fixed - whether fixed format is required.
  • Method Details

    • setFixed

      public void setFixed(boolean fixed)
      Set whether the file is expected to have a fixed number of fields.
      Parameters:
      fixed - whether fixed format is required.
    • isFixed

      public boolean isFixed()
      Get whether the file is expected to have a fixed number of fields.
      Returns:
      true if a fixed number of fields is expected.
    • parse

      public void parse(File file)
      Parse a file using the current configuration.
      Parameters:
      file - a File referring to the input file.
    • getNFields

      public int getNFields()
      Get the number of fields located in the file. If not fixed this is the minimum.
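When the number of fields is not fixed, the minimum is the number of columns that can be read from every row. A minimal sketch of that calculation follows; FieldCountSketch and minFields are hypothetical names, the rows are held in a plain List<String[]> like rowList, and returning 0 for an empty table is an assumption.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: computes the smallest field count over all rows. */
final class FieldCountSketch {

    static int minFields(List<String[]> rows) {
        if (rows.isEmpty()) {
            return 0; // assumption: no rows means no fields
        }
        int min = Integer.MAX_VALUE;
        for (String[] row : rows) {
            min = Math.min(min, row.length);
        }
        return min;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"a", "b", "c"},
            new String[] {"d", "e"});
        System.out.println(minFields(rows)); // prints 2
    }
}
```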
    • getNFields

      public int getNFields(int row)
      Get the number of fields in a row.
    • getNRows

      public int getNRows()
      Get the number of rows located in the file.
    • getRow

      public String[] getRow(int row)
      Get the parsed Strings in a row.
    • getStringField

      public String getStringField(int row, int column)
      Get the String value of a field.
      Parameters:
      row - the row index of the field required.
      column - the column index of the field required.
      Returns:
      the field value if available, otherwise null.
    • getIntegerField

      public int getIntegerField(int row, int column)
      Get the integer value of a field.
      Parameters:
      row - the row index of the field required.
      column - the column index of the field required.
      Returns:
      the field value if available, otherwise 0.
    • getDoubleField

      public double getDoubleField(int row, int column)
      Get the double precision value of a field.
      Parameters:
      row - the row index of the field required.
      column - the column index of the field required.
      Returns:
      the field value if available, otherwise 0.0.
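Because the numeric getters return 0 or 0.0 when a field is unavailable, a caller cannot tell a missing field from a genuine zero by the return value alone; getStringField, which returns null in that case, can be checked first. The sketch below shows the pattern on a plain String[][] table. FieldAccessSketch and its methods are hypothetical, "available" is assumed to mean that the row and column indices are in range, and treating unparsable text like a missing field is also an assumption.

```java
/** Illustrative only: distinguishes a missing field from a real zero. */
final class FieldAccessSketch {

    /** Mirrors the documented getStringField contract: null when absent. */
    static String stringField(String[][] rows, int row, int column) {
        if (row < 0 || row >= rows.length) {
            return null;
        }
        if (column < 0 || column >= rows[row].length) {
            return null;
        }
        return rows[row][column];
    }

    /** Parse the field as a double, or return fallback if absent or not numeric. */
    static double doubleField(String[][] rows, int row, int column, double fallback) {
        String s = stringField(rows, row, column);
        if (s == null) {
            return fallback;
        }
        try {
            return Double.parseDouble(s);
        } catch (NumberFormatException e) {
            return fallback; // assumption: bad text is handled like a missing field
        }
    }

    public static void main(String[] args) {
        String[][] rows = { {"1", "0.0"}, {"2"} };
        System.out.println(doubleField(rows, 0, 1, Double.NaN)); // prints 0.0
        System.out.println(doubleField(rows, 1, 1, Double.NaN)); // prints NaN
    }
}
```

Passing Double.NaN as the fallback keeps a missing value distinct from a stored 0.0.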
    • getBooleanField

      public boolean getBooleanField(int row, int column)
      Get the boolean value of a field. Any string starting with "t" or "T" is considered true, otherwise the value is false.
      Parameters:
      row - the row index of the field required.
      column - the column index of the field required.
      Returns:
      true or false
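The stated rule looks only at the first character, so some results may be surprising. A minimal sketch of the rule follows; BooleanFieldSketch and toBoolean are hypothetical names, and returning false for a null or empty field is an assumption.

```java
/** Illustrative only: the "starts with t or T" rule from the description. */
final class BooleanFieldSketch {

    static boolean toBoolean(String field) {
        if (field == null || field.isEmpty()) {
            return false; // assumption: a missing field counts as false
        }
        char first = field.charAt(0);
        return first == 't' || first == 'T';
    }

    public static void main(String[] args) {
        System.out.println(toBoolean("true"));  // prints true
        System.out.println(toBoolean("T"));     // prints true
        System.out.println(toBoolean("table")); // prints true
        System.out.println(toBoolean("yes"));   // prints false
        System.out.println(toBoolean("1"));     // prints false
    }
}
```

Under this rule, values such as "yes" or "1" are read as false, so files should encode true values with a leading "t" or "T".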
    • setSingleCommentChar

      public void setSingleCommentChar(char singleComment)
      Set the character used for single line comments.
    • getSingleCommentChar

      public char getSingleCommentChar()
      Get the character used for single line comments.
    • setInlineCommentChar

      public void setInlineCommentChar(char inlineComment)
      Set the character used for in-line comments.
    • getInlineCommentChar

      public char getInlineCommentChar()
      Get the character used for in-line comments.
    • setDelimeters

      public void setDelimeters(String delims)
      Set the characters used as field delimiters.
      Parameters:
      delims - list of characters to be used as field delimiters.
    • getDelimeters

      public String getDelimeters()
      Get the characters used as field delimiters.
      Returns:
      the delimiter string if set, or null if the defaults apply.
    • decode

      protected void decode(File file)
      Open, read and decode the contents of the file.
      Parameters:
      file - a File referring to the input file.