Java - StreamTokenizer wordChars(int low, int hi) method



Description

The Java StreamTokenizer wordChars(int low, int hi) method specifies that all characters c in the range low <= c <= hi are word constituents. A word token consists of a word constituent followed by zero or more word constituents or number constituents.
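As a minimal sketch of this rule (the class name UnderscoreWordCharsSketch and the helper tokens are illustrative and not part of the original tutorial), the following program tokenizes max_value twice. The underscore is an ordinary character by default, so it ends the word unless wordChars('_', '_') adds it to the word constituents.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class UnderscoreWordCharsSketch {

   // Hypothetical helper: returns every token of the input as a string.
   static List<String> tokens(String input, boolean underscoreInWords) {
      StreamTokenizer st = new StreamTokenizer(new StringReader(input));
      if (underscoreInWords) {
         st.wordChars('_', '_'); // a range that contains only '_'
      }
      List<String> result = new ArrayList<>();
      try {
         while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
               result.add(st.sval);
            } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
               result.add(String.valueOf(st.nval));
            } else {
               // ordinary character: ttype holds the character itself
               result.add(String.valueOf((char) st.ttype));
            }
         }
      } catch (IOException e) {
         throw new UncheckedIOException(e);
      }
      return result;
   }

   public static void main(String[] args) {
      System.out.println(tokens("max_value", false)); // [max, _, value]
      System.out.println(tokens("max_value", true));  // [max_value]
   }
}
```

A range whose low and hi are the same character, as here, adds exactly one character to the word constituents.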

Declaration

The following is the declaration of the java.io.StreamTokenizer.wordChars(int low, int hi) method.

public void wordChars(int low, int hi)

Parameters

  • low − The low end of the range.

  • hi − The high end of the range.

Return Value

This method does not return a value.

Exception

NA

Example - Usage of StreamTokenizer wordChars(int low, int hi) method

The following example writes a string to a file, reads it back through a StreamTokenizer, and marks the letters o through t as word characters with wordChars(int low, int hi).

StreamTokenizerDemo.java

package com.tutorialspoint;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Reader;
import java.io.StreamTokenizer;

public class StreamTokenizerDemo {
   public static void main(String[] args) {
      String text = "Hello. This is a text \n that will be split "
         + "into tokens. 1 + 1 = 2";
         
      try {
         // create a new file with an ObjectOutputStream
         FileOutputStream out = new FileOutputStream("test.txt");
         ObjectOutputStream oout = new ObjectOutputStream(out);

         // write something in the file
         oout.writeUTF(text);
         oout.flush();

         // create an ObjectInputStream for the file we created before
         ObjectInputStream ois = new ObjectInputStream(new FileInputStream("test.txt"));

         // create a new tokenizer
         Reader r = new BufferedReader(new InputStreamReader(ois));
         StreamTokenizer st = new StreamTokenizer(r);

         // mark the letters 'o' through 't' as word characters
         // (letters are word characters by default, so the output is unchanged)
         st.wordChars('o', 't');

         // print the stream tokens
         boolean eof = false;
         
         do {
            int token = st.nextToken();

            switch (token) {
               case StreamTokenizer.TT_EOF:
                  System.out.println("End of File encountered.");
                  eof = true;
                  break;
                  
               case StreamTokenizer.TT_EOL:
                  System.out.println("End of Line encountered.");
                  break;
                  
               case StreamTokenizer.TT_WORD:
                  System.out.println("Word: " + st.sval);
                  break;
                  
               case StreamTokenizer.TT_NUMBER:
                  System.out.println("Number: " + st.nval);
                  break;
                  
               default:
                  System.out.println((char) token + " encountered.");
                  
                  if (token == '!') {
                     eof = true;
                  }
            }
         } while (!eof);

      } catch (Exception ex) {
         ex.printStackTrace();
      }
   }
}

Output

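Compile and run the above program. The leading A in the first word of the output is not part of the text: writeUTF() writes the length of the string (65 bytes here) in two bytes before the text, and the byte 65 is read back as the character A. The program produces the following result: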

Word: AHello.
Word: This
Word: is
Word: a
Word: text
Word: that
Word: will
Word: be
Word: split
Word: into
Word: tokens.
Number: 1.0
+ encountered.
Number: 1.0
= encountered.
Number: 2.0
End of File encountered.

Example - Include digits (0-9) as part of words

The following example calls wordChars('0', '9') so that the digits 0-9 are word constituents in addition to the letters.

StreamTokenizerDemo.java

package com.tutorialspoint;

import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class StreamTokenizerDemo {
   public static void main(String[] args) throws IOException {
      String input = "item1 item2 item3";

      Reader reader = new StringReader(input);
      StreamTokenizer tokenizer = new StreamTokenizer(reader);

      tokenizer.wordChars('a', 'z'); // letters
      tokenizer.wordChars('A', 'Z');
      tokenizer.wordChars('0', '9'); // include digits in words
      tokenizer.whitespaceChars(' ', ' '); // treat space as whitespace

      System.out.println("Tokens:");
      while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
         if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
            System.out.println("Word: " + tokenizer.sval);
         }
      }
   }
}

Output

Compile and run the above program. It produces the following result:

Tokens:
Word: item1
Word: item2
Word: item3

Explanation

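  • Digits are part of the word token.

  • item1 is kept as a single word and is not split into item and 1. Because a word may continue with number constituents, this also holds with the default settings; the wordChars('0', '9') call only states it explicitly.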
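The difference shows up only for tokens that begin with a digit. The sketch below is an illustration under that assumption (the class name DigitWordCharsSketch and the helper tokens are hypothetical). It calls wordChars('0', '9') in both runs and optionally clears the numeric attribute of the digits first with ordinaryChars('0', '9'). Digits keep their numeric attribute after wordChars alone, so 2fast still starts with a number token.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class DigitWordCharsSketch {

   // Hypothetical helper: returns every token of the input as a string.
   static List<String> tokens(String input, boolean clearNumericFirst) {
      StreamTokenizer st = new StreamTokenizer(new StringReader(input));
      if (clearNumericFirst) {
         st.ordinaryChars('0', '9'); // digits lose their numeric attribute
      }
      st.wordChars('0', '9');        // digits become word constituents
      List<String> result = new ArrayList<>();
      try {
         while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
               result.add(st.sval);
            } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
               result.add(String.valueOf(st.nval));
            } else {
               result.add(String.valueOf((char) st.ttype));
            }
         }
      } catch (IOException e) {
         throw new UncheckedIOException(e);
      }
      return result;
   }

   public static void main(String[] args) {
      // wordChars only: a leading digit is still parsed as a number
      System.out.println(tokens("item1 2fast", false)); // [item1, 2.0, fast]
      // ordinaryChars, then wordChars: digits behave like letters
      System.out.println(tokens("item1 2fast", true));  // [item1, 2fast]
   }
}
```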

Example - Treat hyphen - as part of words (e.g., for hyphenated names)

The following example calls wordChars('-', '-') so that the hyphen is a word constituent.

StreamTokenizerDemo.java

package com.tutorialspoint;

import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class StreamTokenizerDemo {
   public static void main(String[] args) throws IOException {
      String input = "Jean-Paul Mary-Jane John";

      Reader reader = new StringReader(input);
      StreamTokenizer tokenizer = new StreamTokenizer(reader);

      tokenizer.wordChars('a', 'z');
      tokenizer.wordChars('A', 'Z');
      tokenizer.wordChars('-', '-'); // include hyphen in word
      tokenizer.whitespaceChars(' ', ' ');

      System.out.println("Tokens:");
      while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
         if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
            System.out.println("Word: " + tokenizer.sval);
         }
      }
   }
}

Output

Compile and run the above program. It produces the following result:

Tokens:
Word: Jean-Paul
Word: Mary-Jane
Word: John

Explanation

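  • With the default settings the hyphen is a number constituent (it is used for negative numbers), so it can already follow a letter inside a word. Names like Jean-Paul would be split into Jean and Paul only if - had been made an ordinary character, for example with ordinaryChar('-').

  • wordChars('-', '-') marks the hyphen explicitly as a word constituent, so it is treated as part of the word.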
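To see wordChars('-', '-') change the result, the hyphen first has to lose its default numeric attribute. The following sketch (the class name HyphenWordCharsSketch and the helper tokens are illustrative assumptions) calls ordinaryChar('-') and then compares the tokens with and without the wordChars call.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class HyphenWordCharsSketch {

   // Hypothetical helper: returns every token of the input as a string.
   static List<String> tokens(String input, boolean hyphenInWords) {
      StreamTokenizer st = new StreamTokenizer(new StringReader(input));
      st.ordinaryChar('-');       // '-' is now an ordinary character
      if (hyphenInWords) {
         st.wordChars('-', '-');  // '-' is now a word constituent
      }
      List<String> result = new ArrayList<>();
      try {
         while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
               result.add(st.sval);
            } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
               result.add(String.valueOf(st.nval));
            } else {
               result.add(String.valueOf((char) st.ttype));
            }
         }
      } catch (IOException e) {
         throw new UncheckedIOException(e);
      }
      return result;
   }

   public static void main(String[] args) {
      System.out.println(tokens("Jean-Paul John", false)); // [Jean, -, Paul, John]
      System.out.println(tokens("Jean-Paul John", true));  // [Jean-Paul, John]
   }
}
```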
