https://docs.python.org/3/library/tokenize.html
https://www.techopedia.com/definition/13698/tokenization
Tokenization is the act of breaking a sequence of characters into pieces called tokens, such as words, keywords, phrases, and symbols. Tokens can be individual words, phrases, or even whole sentences. During tokenization, some characters, such as punctuation marks, may be discarded. The tokens then become the input for another process, such as parsing or text mining.
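As a rough illustration of the idea, here is a minimal sketch of a word tokenizer in Python. The function name `simple_tokenize` is made up for this example, and treating every run of word characters as one token is only one possible rule; real tokenizers usually handle contractions, hyphens, and numbers more carefully.

```python
import re


def simple_tokenize(text):
    """Split text into word tokens, discarding punctuation and whitespace."""
    # \w+ matches runs of letters, digits, and underscores.
    # Anything else (spaces, commas, periods) is dropped.
    return re.findall(r"\w+", text)


tokens = simple_tokenize("Tokens become input for parsing, text mining, etc.")
print(tokens)
# ['Tokens', 'become', 'input', 'for', 'parsing', 'text', 'mining', 'etc']
```

The resulting list could then be handed to a later step such as a parser or a word-frequency count.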
Tokenization is used in computer science, where it plays a large part in the process of lexical analysis.
Lexical analysis:
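For source code, the tokens also carry a type. Python's standard `tokenize` module (linked at the top) can show this for Python source. The sketch below is only one way to call it; the helper name `lex_python` is an assumption made for this example and is not part of the module.

```python
import io
import tokenize


def lex_python(source):
    """Return (token type name, token string) pairs for Python source code."""
    # generate_tokens expects a readline-style callable that yields str lines.
    reader = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(reader)
    ]


for kind, text in lex_python("total = price * 2\n"):
    print(kind, repr(text))
```

For this input the names come out as `NAME` tokens, `=` and `*` as `OP` tokens, and `2` as a `NUMBER` token, followed by the `NEWLINE` and `ENDMARKER` tokens that the module adds at the end of the input.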