The little tokenizer… (2)

Here is an answer to the previous exercise:

It is written in Haskell, but sadly WordPress has no syntax highlighting for it.

text = [
	"a",
	"b",
	"c",
	"\td",
	"\te",
	"\t\t\tf",
	"\t\tg",
	"\th"
	]
	
	
data Block
	= Block [Block]
	| Line String
	deriving Show

indentationToList :: [String] -> Int -> [Block]
indentationToList [] _ = []
indentationToList ls@(l : rest) tabs
    -- Same depth: keep the line and carry on with the rest.
    | countTabs l == tabs = Line l : indentationToList rest tabs
    -- Deeper: gather the run of deeper lines into a sub-block, one level down.
    | countTabs l >  tabs = Block (indentationToList indented (tabs + 1)) : indentationToList remaining tabs
    | otherwise = error "The value of the tabs parameter is greater than the number of tabs in the text!"
  where
    (indented, remaining) = span (\x -> countTabs x > tabs) ls
    countTabs = length . takeWhile (== '\t')

main :: IO ()
main = print (indentationToList text 0)

Yay… this is a terse piece of code!

Yeah, let’s see what the ‘indentationToList’ function does exactly. It takes as input a list of lines (a part of the text) and a number of tabs. As output it gives a list of blocks.

A list of blocks?

Exactly. And each block in the list can either be a line or a list of sub-blocks itself. Since a Haskell list cannot mix plain lines with lists nested to an arbitrary depth, we use a recursive ‘Block’ type instead!
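To make the nesting concrete, here is a small sketch of the kind of value we are after for a three-line text. The ‘example’ value and the ‘depth’ helper are made up for this illustration and are not part of the answer above; ‘Eq’ is derived only so that values can be compared.

```haskell
-- Same shape as the Block type in the answer; Eq is added only for comparisons.
data Block
    = Block [Block]
    | Line String
    deriving (Show, Eq)

-- Hand-built result for the text ["a", "\tb", "c"] (hypothetical example).
example :: [Block]
example = [Line "a", Block [Line "\tb"], Line "c"]

-- Hypothetical helper: how deeply are the blocks of a list nested?
depth :: [Block] -> Int
depth bs = maximum (0 : map go bs)
  where
    go (Line _)    = 0
    go (Block sub) = 1 + depth sub
```

Loaded in GHCi, ‘depth example’ evaluates to 1: the indented line sits one block below the top-level list.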

So, what does ‘indentationToList’ do?

Well, if no lines are provided then it returns an empty list. Quite straightforward. Otherwise, what happens depends on how the number of tabs on the first line compares with the tabs argument.

What does the tabs argument mean?

It means: “I am constructing the list of the lines having tabs tabulations”.
If the current line has exactly tabs tabulations, then the result is a list whose first element is the line, followed by the result of ‘indentationToList’ applied to the remaining lines.
If the current line has more than tabs tabulations, then the result is a list whose first element is a sublist, obtained by applying ‘indentationToList’ (with tabs + 1) to the run of consecutive lines that have more than tabs tabulations, followed by the result of ‘indentationToList’ applied to the remaining lines.
If the current line has fewer than tabs tabulations, then the tabs argument was too large for this text and an error is raised.
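The split into “deeper lines” and “remaining lines” can be tried on its own. This is only a sketch: ‘countTabs’ is repeated here at top level so the snippet stands alone, and ‘splitDeeper’ is a name chosen for the illustration, not something the answer defines.

```haskell
-- Count the leading tab characters of a line.
countTabs :: String -> Int
countTabs = length . takeWhile (== '\t')

-- Split off the leading run of lines indented deeper than `tabs`.
-- `span` stops at the first line that is not deeper, so a deeper line
-- that comes after a shallower one stays in the second half.
splitDeeper :: Int -> [String] -> ([String], [String])
splitDeeper tabs = span (\l -> countTabs l > tabs)
```

For instance, ‘splitDeeper 1 ["\t\t\tf", "\t\tg", "\th"]’ gives ‘(["\t\t\tf", "\t\tg"], ["\th"])’: the first two lines go into the sub-block and the last one is handled at the current level.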

Is that clear?

Yeah, nearly. I just need to read the example again to be sure it’s completely clear.

Are we finished with indentation?

Of course not.

Ok, what’s next?

We will handle unfinished lines and multiline strings.

What about unfinished lines?

Suppose that lines you have……..

Is that all?

…Far from it!!! …but I’ll finish it another day, I’m tired.
