xoreos
0.0.5
Tokenizes a stream.
#include <streamtokenizer.h>
Public Types | |
enum | ConsecutiveSeparatorRule { kRuleIgnoreSame, kRuleIgnoreAll, kRuleHeed } |
What to do when consecutive separators are found. | |
Public Member Functions | |
StreamTokenizer (ConsecutiveSeparatorRule conSepRule=kRuleHeed) | |
void | addSeparator (uint32 c) |
Add a character at which to split tokens. | |
void | addChunkEnd (uint32 c) |
Add a character marking the end of a chunk. | |
void | addQuote (uint32 c) |
Add a character able to enclose (quote) separators and chunk ends. | |
void | addIgnore (uint32 c) |
Add a character to ignore. | |
UString | getToken (SeekableReadStream &stream) |
Parse a token out of the stream. | |
size_t | getTokens (SeekableReadStream &stream, std::vector< UString > &list, size_t min=0, size_t max=SIZE_MAX, const UString &def="") |
Parse tokens out of the stream. | |
void | findFirstToken (SeekableReadStream &stream) |
Find the first token character, skipping past separators. | |
void | skipToken (SeekableReadStream &stream, size_t n=1) |
Skip a number of tokens. | |
void | skipChunk (SeekableReadStream &stream) |
Skip to the end of the chunk. | |
void | nextChunk (SeekableReadStream &stream) |
Skip past end of chunk characters. | |
Private Member Functions | |
bool | isChunkEnd (SeekableReadStream &stream) |
Static Private Member Functions | |
static bool | isIn (uint32 c, const std::list< uint32 > &list) |
Private Attributes | |
ConsecutiveSeparatorRule | _conSepRule |
std::list< uint32 > | _separators |
std::list< uint32 > | _quotes |
std::list< uint32 > | _chunkEnds |
std::list< uint32 > | _ignores |
Tokenizes a stream.
Definition at line 42 of file streamtokenizer.h.
What to do when consecutive separators are found.
Enumerator | |
---|---|
kRuleIgnoreSame | Ignore the repeated separator, but only if it's the same. |
kRuleIgnoreAll | Ignore all repeated separators. |
kRuleHeed | Heed each separator. |
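The three rules can be compared with a small standalone sketch. It does not use the xoreos classes: `SepRule` and `splitWithRule` are hypothetical names, the input is a plain `std::string` rather than a stream, and the exact handling of separator runs is an assumption based on the enumerator descriptions above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for ConsecutiveSeparatorRule (not the xoreos enum).
enum class SepRule { IgnoreSame, IgnoreAll, Heed };

// Split `in` at any character found in `seps`, applying `rule` to the
// separator characters that directly follow a separator.
std::vector<std::string> splitWithRule(const std::string &in,
                                       const std::string &seps, SepRule rule) {
    std::vector<std::string> tokens;
    std::string token;

    std::size_t i = 0;
    while (i < in.size()) {
        const char c = in[i++];
        if (seps.find(c) == std::string::npos) {
            token += c;
            continue;
        }

        // A separator ends the current token.
        tokens.push_back(token);
        token.clear();

        if (rule == SepRule::IgnoreSame) {
            // Skip repeats of this exact separator only.
            while (i < in.size() && in[i] == c)
                ++i;
        } else if (rule == SepRule::IgnoreAll) {
            // Skip any following separator characters.
            while (i < in.size() && seps.find(in[i]) != std::string::npos)
                ++i;
        }
        // SepRule::Heed: every separator counts, so nothing is skipped.
    }

    tokens.push_back(token);
    return tokens;
}
```

In this sketch, `"a,,b"` split at `','` gives `"a"`, `""`, `"b"` under `Heed` and `"a"`, `"b"` under `IgnoreSame`. With both `','` and `' '` as separators, `"a, b"` still produces an empty middle token under `IgnoreSame`, because the two separators differ, and only `IgnoreAll` collapses them.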
Definition at line 45 of file streamtokenizer.h.
Common::StreamTokenizer::StreamTokenizer | ( | ConsecutiveSeparatorRule | conSepRule = kRuleHeed | ) |
Definition at line 33 of file streamtokenizer.cpp.
void Common::StreamTokenizer::addChunkEnd | ( | uint32 | c | ) |
Add a character marking the end of a chunk.
A chunk end is essentially a higher-order separator. Parsing tokens will stop at chunk end characters and will not move past them. Only a call to nextChunk() will move past a chunk end character.
Definition at line 56 of file streamtokenizer.cpp.
References _chunkEnds, _ignores, _quotes, _separators, and isIn().
Referenced by Aurora::VISFile::load(), Sound::XACTWaveBank_ASCII::load(), Aurora::LYTFile::load(), Graphics::Aurora::Model_NWN::ParserContext::ParserContext(), and Aurora::TwoDAFile::read2a().
void Common::StreamTokenizer::addIgnore | ( | uint32 | c | ) |
Add a character to ignore.
A character that is ignored will never be added to the token. For example, with the ignore character '#' and the separator character ',', the string "fo#o,#bar" will be split into two tokens: "foo" and "bar".
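A minimal standalone sketch of the ignore behaviour, operating on a `std::string` instead of a stream. `splitIgnoring` is a hypothetical helper written for illustration and is not part of StreamTokenizer.

```cpp
#include <string>
#include <vector>

// Split `in` at `sep`, dropping every occurrence of `ignore` so that it
// never becomes part of a token.
std::vector<std::string> splitIgnoring(const std::string &in, char sep, char ignore) {
    std::vector<std::string> tokens;
    std::string token;

    for (char c : in) {
        if (c == ignore)
            continue;  // ignored characters are silently dropped

        if (c == sep) {
            tokens.push_back(token);
            token.clear();
            continue;
        }

        token += c;
    }

    tokens.push_back(token);
    return tokens;
}
```

Calling `splitIgnoring("fo#o,#bar", ',', '#')` returns the two tokens `"foo"` and `"bar"`.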
Definition at line 62 of file streamtokenizer.cpp.
References _chunkEnds, _ignores, _quotes, _separators, and isIn().
Referenced by Aurora::VISFile::load(), Sound::XACTWaveBank_ASCII::load(), Aurora::LYTFile::load(), Graphics::Aurora::Model_NWN::ParserContext::ParserContext(), and Aurora::TwoDAFile::read2a().
void Common::StreamTokenizer::addQuote | ( | uint32 | c | ) |
Add a character able to enclose (quote) separators and chunk ends.
For example, with the quote character '\'' and separator character ',', the string "foo\',\'bar,foo" will be split into two tokens: "foo,bar" and "foo".
Every quote character is handled as if it's the same! So with the quote characters '\'' and '"', the string "foo\',\"bar,foo" will also yield the two tokens "foo,bar" and "foo".
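The quoting rule can be sketched as a single toggle shared by all quote characters. This is a standalone illustration on a `std::string`; `splitQuoted` is a hypothetical name, and dropping the quote characters from the output is an assumption taken from the example tokens.

```cpp
#include <string>
#include <vector>

// Split `in` at `sep`, except where the separator is enclosed by quote
// characters. Any character in `quotes` opens or closes a quoted span;
// an opening quote may be closed by a different quote character.
std::vector<std::string> splitQuoted(const std::string &in, char sep,
                                     const std::string &quotes) {
    std::vector<std::string> tokens;
    std::string token;
    bool inQuote = false;

    for (char c : in) {
        if (quotes.find(c) != std::string::npos) {
            inQuote = !inQuote;  // every quote character toggles the same state
            continue;
        }

        if (c == sep && !inQuote) {
            tokens.push_back(token);
            token.clear();
            continue;
        }

        token += c;  // separators inside quotes are kept verbatim
    }

    tokens.push_back(token);
    return tokens;
}
```

With quotes `"'\""` and separator `','`, both `"foo',\'bar,foo"` and `"foo',\"bar,foo"` split into `"foo,bar"` followed by the trailing `"foo"`.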
Definition at line 50 of file streamtokenizer.cpp.
References _chunkEnds, _ignores, _quotes, _separators, and isIn().
Referenced by Aurora::TwoDAFile::read2a().
void Common::StreamTokenizer::addSeparator | ( | uint32 | c | ) |
Add a character at which to split tokens.
For example, with the separator character ',', the string "foo,bar" will be split into two tokens: "foo" and "bar".
Several different characters can act as separator characters at the same time.
The ConsecutiveSeparatorRule value signals how consecutive separator characters are handled.
Definition at line 44 of file streamtokenizer.cpp.
References _chunkEnds, _ignores, _quotes, _separators, and isIn().
Referenced by Aurora::VISFile::load(), Sound::XACTWaveBank_ASCII::load(), Aurora::LYTFile::load(), Graphics::Aurora::Model_NWN::ParserContext::ParserContext(), Aurora::TwoDAFile::read2a(), Aurora::TwoDAFile::readHeaders2b(), Aurora::TwoDAFile::readRows2b(), and Aurora::TwoDAFile::skipRowNames2b().
void Common::StreamTokenizer::findFirstToken | ( | SeekableReadStream & | stream | ) |
Find the first token character, skipping past separators.
Position the stream at the first character that is neither a separator nor an ignored character. This is useful if the first token of a chunk might be indented with separator characters.
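As a rough standalone sketch of this positioning step, the function below returns an index into a `std::string` instead of seeking a stream. `firstTokenPos` is a hypothetical helper, not the xoreos implementation.

```cpp
#include <cstddef>
#include <string>

// Return the index of the first character in `in` that is neither a
// separator nor an ignored character, or in.size() if there is none.
std::size_t firstTokenPos(const std::string &in, const std::string &seps,
                          const std::string &ignores) {
    std::size_t pos = 0;

    while (pos < in.size() &&
           (seps.find(in[pos]) != std::string::npos ||
            ignores.find(in[pos]) != std::string::npos))
        ++pos;

    return pos;
}
```

For an indented line such as `"  \tfoo bar"` with separators `" \t"`, the returned index is 3, the position of the `f`.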
Definition at line 213 of file streamtokenizer.cpp.
References _ignores, _separators, isIn(), Common::ReadStream::kEOF, Common::SeekableReadStream::kOriginCurrent, Common::ReadStream::readChar(), and Common::SeekableReadStream::seek().
Referenced by Aurora::TwoDAFile::readRows2a().
UString Common::StreamTokenizer::getToken | ( | SeekableReadStream & | stream | ) |
Parse a token out of the stream.
Go through the stream, character by character, collecting characters for a token. Collection stops at a separator character, at a chunk end character, or at the end of the stream.
When we find a separator character, the stream will be positioned after this character (potentially skipping over following separators depending on the ConsecutiveSeparatorRule value).
When we find a chunk end character, the stream will be positioned before this character. Only a call to nextChunk() will move the stream past it.
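The two positioning rules can be sketched with a cursor over an in-memory string. `Cursor`, `readToken`, and `stepPastChunkEnd` are hypothetical names used only for illustration; the sketch handles a single separator and a single chunk end character, and leaves out quotes, ignored characters, and the ConsecutiveSeparatorRule.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a seekable stream: a string plus a read position.
struct Cursor {
    std::string data;
    std::size_t pos;
};

// Collect one token. A separator is consumed, so the cursor ends up after it.
// A chunk end is not consumed, so the cursor stays in front of it.
std::string readToken(Cursor &in, char sep, char chunkEnd) {
    std::string token;

    while (in.pos < in.data.size()) {
        const char c = in.data[in.pos];
        if (c == chunkEnd)
            break;  // stop before the chunk end character

        ++in.pos;
        if (c == sep)
            break;  // stop after the separator

        token += c;
    }

    return token;
}

// Move past a chunk end character if the cursor is directly in front of one.
void stepPastChunkEnd(Cursor &in, char chunkEnd) {
    if (in.pos < in.data.size() && in.data[in.pos] == chunkEnd)
        ++in.pos;
}
```

On the data `"a b\nc"` with separator `' '` and chunk end `'\n'`, repeated `readToken` calls return `"a"`, then `"b"`, and after that only empty tokens, since the cursor stays in front of the newline. Once `stepPastChunkEnd` has been called, the next `readToken` returns `"c"`.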
Definition at line 68 of file streamtokenizer.cpp.
References _chunkEnds, _conSepRule, _ignores, _quotes, _separators, Common::UString::end(), Common::UString::findFirst(), isIn(), Common::ReadStream::kEOF, Common::SeekableReadStream::kOriginCurrent, kRuleHeed, kRuleIgnoreSame, Common::ReadStream::readChar(), Common::SeekableReadStream::seek(), and Common::UString::truncate().
Referenced by getTokens(), Aurora::TwoDAFile::readHeaders2b(), Aurora::TwoDAFile::readRows2b(), and skipToken().
size_t Common::StreamTokenizer::getTokens | ( | SeekableReadStream & | stream, |
std::vector< UString > & | list, |
size_t | min = 0, |
size_t | max = SIZE_MAX, |
const UString & | def = "" |
) |
Parse tokens out of the stream.
This method calls getToken() repeatedly and collects all tokens into a list.
stream | The stream to parse out of. |
list | The list to parse into. |
min | Minimum number of tokens to parse. |
max | Maximum number of tokens to parse. |
def | Non-existing tokens are assigned this value. |
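How `min`, `max`, and `def` interact can be sketched without a stream, using a vector of tokens that are assumed to be already available in the current chunk. `collectTokens` is a hypothetical helper, and returning the number of tokens actually taken (excluding padding) is an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Copy at most `max` tokens from `chunk` into `list`, then pad `list` with
// `def` until it holds at least `min` entries. Returns the number of tokens
// taken from `chunk`, not counting padding (an assumption of this sketch).
std::size_t collectTokens(const std::vector<std::string> &chunk,
                          std::vector<std::string> &list,
                          std::size_t min = 0, std::size_t max = SIZE_MAX,
                          const std::string &def = "") {
    list.clear();

    std::size_t taken = 0;
    while (taken < chunk.size() && taken < max)
        list.push_back(chunk[taken++]);

    // Tokens that do not exist are filled in with the default value.
    while (list.size() < min)
        list.push_back(def);

    return taken;
}
```

With a chunk holding `"1"` and `"2"`, `min = 3`, `max = 3`, and `def = "0"`, the list becomes `"1"`, `"2"`, `"0"`. With four available tokens and `max = 2`, only the first two are taken.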
Definition at line 189 of file streamtokenizer.cpp.
References _conSepRule, Common::UString::empty(), getToken(), isChunkEnd(), and kRuleIgnoreAll.
Referenced by Sound::getFirst(), Aurora::VISFile::load(), Sound::XACTWaveBank_ASCII::load(), Aurora::LYTFile::load(), Graphics::Aurora::ModelNode_NWN_ASCII::load(), Graphics::Aurora::Model_NWN::loadASCII(), Graphics::Aurora::ModelNode_NWN_ASCII::readConstraints(), Aurora::TwoDAFile::readDefault2a(), Graphics::Aurora::ModelNode_NWN_ASCII::readFaces(), Aurora::TwoDAFile::readHeaders2a(), Aurora::TwoDAFile::readRows2a(), Graphics::Aurora::ModelNode_NWN_ASCII::readTCoords(), Graphics::Aurora::ModelNode_NWN_ASCII::readVCoords(), Graphics::Aurora::ModelNode_NWN_ASCII::readWeights(), and Graphics::Aurora::Model_NWN::skipAnimASCII().
bool Common::StreamTokenizer::isChunkEnd | ( | SeekableReadStream & | stream | ) | private |
Definition at line 251 of file streamtokenizer.cpp.
References _chunkEnds, isIn(), Common::ReadStream::kEOF, Common::SeekableReadStream::kOriginCurrent, Common::ReadStream::readChar(), and Common::SeekableReadStream::seek().
Referenced by getTokens().
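static bool Common::StreamTokenizer::isIn | ( | uint32 | c, |
const std::list< uint32 > & | list |
) | static private |
Definition at line 36 of file streamtokenizer.cpp.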
Referenced by addChunkEnd(), addIgnore(), addQuote(), addSeparator(), findFirstToken(), getToken(), isChunkEnd(), nextChunk(), and skipChunk().
void Common::StreamTokenizer::nextChunk | ( | SeekableReadStream & | stream | ) |
Skip past end of chunk characters.
If the next character is a chunk end character, position the stream directly past it. If the next character is not a chunk end character, do nothing.
Definition at line 240 of file streamtokenizer.cpp.
References _chunkEnds, isIn(), Common::ReadStream::kEOF, Common::SeekableReadStream::kOriginCurrent, Common::ReadStream::readChar(), Common::SeekableReadStream::seek(), and skipChunk().
Referenced by Sound::getFirst(), Aurora::VISFile::load(), Sound::XACTWaveBank_ASCII::load(), Aurora::LYTFile::load(), Graphics::Aurora::ModelNode_NWN_ASCII::load(), Graphics::Aurora::Model_NWN::loadASCII(), Graphics::Aurora::ModelNode_NWN_ASCII::readConstraints(), Aurora::TwoDAFile::readDefault2a(), Graphics::Aurora::ModelNode_NWN_ASCII::readFaces(), Aurora::TwoDAFile::readHeaders2a(), Aurora::TwoDAFile::readRows2a(), Graphics::Aurora::ModelNode_NWN_ASCII::readTCoords(), Graphics::Aurora::ModelNode_NWN_ASCII::readVCoords(), Graphics::Aurora::ModelNode_NWN_ASCII::readWeights(), and Graphics::Aurora::Model_NWN::skipAnimASCII().
void Common::StreamTokenizer::skipChunk | ( | SeekableReadStream & | stream | ) |
Skip to the end of the chunk.
The stream will be positioned before the next chunk end character.
Definition at line 228 of file streamtokenizer.cpp.
References _chunkEnds, isIn(), Common::ReadStream::kEOF, Common::SeekableReadStream::kOriginCurrent, Common::ReadStream::readChar(), and Common::SeekableReadStream::seek().
Referenced by nextChunk().
void Common::StreamTokenizer::skipToken | ( | SeekableReadStream & | stream, |
size_t | n = 1 |
) |
Skip a number of tokens.
Definition at line 223 of file streamtokenizer.cpp.
References getToken().
Referenced by Aurora::TwoDAFile::readRows2a(), and Aurora::TwoDAFile::skipRowNames2b().
std::list< uint32 > Common::StreamTokenizer::_chunkEnds | private |
Definition at line 160 of file streamtokenizer.h.
Referenced by addChunkEnd(), addIgnore(), addQuote(), addSeparator(), getToken(), isChunkEnd(), nextChunk(), and skipChunk().
ConsecutiveSeparatorRule Common::StreamTokenizer::_conSepRule | private |
Definition at line 156 of file streamtokenizer.h.
Referenced by getToken(), and getTokens().
std::list< uint32 > Common::StreamTokenizer::_ignores | private |
Definition at line 161 of file streamtokenizer.h.
Referenced by addChunkEnd(), addIgnore(), addQuote(), addSeparator(), findFirstToken(), and getToken().
std::list< uint32 > Common::StreamTokenizer::_quotes | private |
Definition at line 159 of file streamtokenizer.h.
Referenced by addChunkEnd(), addIgnore(), addQuote(), addSeparator(), and getToken().
std::list< uint32 > Common::StreamTokenizer::_separators | private |
Definition at line 158 of file streamtokenizer.h.
Referenced by addChunkEnd(), addIgnore(), addQuote(), addSeparator(), findFirstToken(), and getToken().