Word counter



In the room object, under "After entering the room", click the code view and type:

get input {
  text = result
  totalLength = LengthOf(text)
  textWithoutSpaces = Replace(text, " ", "")
  lengthWithoutSpaces = LengthOf(textWithoutSpaces)
  spaceCount = totalLength - lengthWithoutSpaces
  msg ("Number of spaces: " + spaceCount)
  numberofwords = spaceCount+1
  msg ("Number of words: " + numberofwords)
}

Test the code: type in "test test test" and you will get
Number of spaces: 2
Number of words: 3

Obviously, the number of spaces is redundant, but I need it to show you how the method works.
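
For anyone who wants to check the arithmetic outside Quest, here is a rough Python sketch of the same space-counting idea. It is not Quest code, and the name count_by_spaces is made up purely for illustration.

```python
def count_by_spaces(text):
    """Mirror the Quest script above: number of words = number of spaces + 1."""
    total_length = len(text)
    # Removing every space and comparing lengths gives the number of spaces.
    length_without_spaces = len(text.replace(" ", ""))
    space_count = total_length - length_without_spaces
    return space_count, space_count + 1


spaces, words = count_by_spaces("test test test")
print("Number of spaces:", spaces)
print("Number of words:", words)
```

Running it on "test test test" should report 2 spaces and 3 words, matching the Quest output.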


Create a function named wordcounter,
then type in the slimmed-down code, which no longer tells the player the number of spaces:

get input {
  text = result
  totalLength = LengthOf(text)
  textWithoutSpaces = Replace(text, " ", "")
  lengthWithoutSpaces = LengthOf(textWithoutSpaces)
  spaceCount = totalLength - lengthWithoutSpaces
  numberofwords = spaceCount+1
  msg ("Number of words: " + numberofwords)
}

This makes for a more flexible word counter, since you can just call the function whenever you need it.


Counting words isn't necessarily the same as counting spaces. This function will give odd counts if you feed it a string with multiple spaces between words, with spaces at the beginning or end, or with punctuation marks instead of spaces.
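
To make those odd counts concrete, here is a small Python sketch (not Quest code; words_by_spaces is a made-up name) that applies the "spaces + 1" rule to a few awkward inputs.

```python
def words_by_spaces(text):
    """Naive word count: number of spaces plus one."""
    return text.count(" ") + 1


# Each tuple is (input, what the naive rule reports).
samples = [
    ("one two", words_by_spaces("one two")),      # normal case
    ("one  two", words_by_spaces("one  two")),    # double space between words
    (" one two ", words_by_spaces(" one two ")),  # leading and trailing space
    ("one,two", words_by_spaces("one,two")),      # punctuation instead of a space
    ("", words_by_spaces("")),                    # empty input
]

for text, count in samples:
    print(repr(text), "->", count)
```

Only the first input gets the right answer of 2. The others come out as 3, 4, 1 and 1, even though the first three strings contain two words each and the empty string contains none.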

Here's a (somewhat slower) way to count words in a string:

<function name="CountWords" parameters="input" type="int"><![CDATA[
  result = 0
  while (IsRegexMatch ("\\w", input)) {
    result = result + 1
    split = Populate ("^\\W*\\w++\\W*(?<remainder>.*)", input, "firstword")
    input = DictionaryItem (split, "remainder")
  }
  return (result)
]]></function>

This uses the regular expression patterns \w++ (matches any complete word) and \W* (matches a block of nonword characters - including spaces and punctuation). So the call to IsRegexMatch checks if there is a word character (\w, any letter or digit) in the string. If so, Populate removes the first word and any spaces/punctuation from either side of it, and stores the part of the string that still needs to be counted in the remainder subpattern.
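
The same loop can be sketched in Python to see it working outside Quest. This is only an illustration under a few assumptions: count_words is a made-up name, Python's re module stands in for IsRegexMatch and Populate, and plain \w+ is used instead of \w++ because possessive quantifiers are only accepted by Python 3.11 and later.

```python
import re


def count_words(text):
    """Count words by repeatedly stripping the first word off the string."""
    count = 0
    # Keep going while at least one word character is left.
    while re.search(r"\w", text):
        count += 1
        # Skip leading non-word characters, consume one word and whatever
        # non-word characters follow it, and keep the rest as "remainder".
        match = re.match(r"^\W*\w+\W*(?P<remainder>.*)", text)
        text = match.group("remainder")
    return count


print(count_words("test test test"))
print(count_words("  hello,   world! "))
print(count_words(""))
```

With these inputs the sketch should print 3, 2 and 0, so extra spaces, punctuation and empty input no longer throw the count off.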


I changed mrangel's code into the following code-view snippet that you can copy and paste.
I am not sure if I did it right:

  1. I swapped 'input' and 'result' around, as the Quest app recognizes 'result' instead
  2. I wrapped it in a get input {} block
  3. The Quest app gave an error, so I changed \w++ to \w+

get input {
  text = result
  count = 0
  while (IsRegexMatch("\\w", text)) {
    count = count + 1
    split = Populate("^\\W*\\w+\\W*(?<remainder>.*)", text, "firstword")
    text = DictionaryItem(split, "remainder")
  }
  msg ("Number of words: " + count)
}

You could also probably add his function, then just use it in your code like this:

get input {
  msg("Number of words: " + CountWords(result))
}

Huh… an error? Does that mean the regex engine that Quest uses doesn't support possessive (sticky) quantifiers?
I don't think it'll make any difference in this case, \w+ should work just as well. But I thought that had been part of the regex standard for a very long time.

In case anyone is wondering about the distinction:

  • \w matches a single word character
  • \w+ matches one or more word characters
  • \w++ matches one or more word characters that are not followed by any more word characters
  • (\W is the opposite of \w, matching non-word characters in the same way that \S matches non-space characters, and \D matches non-digits)

EDIT: I initially copied and pasted the wrong thing here.

This is nifty.

I made that one change and wrapped it in <![CDATA[ ]]>, and it seems to work flawlessly.


Gng

\w matches a single word character
\w+ matches one or more word characters
\w++ matches one or more word characters that are not followed by any more word characters
(\W is the opposite of \w, matching non-word characters in the same way that \S matches non-space characters, and \D matches non-digits)

We need more tutorials on regular expressions ._.

