Baseball Toaster The Juice Blog
Baseball Is A Language
2004-06-09 14:25
by Ken Arneson

I tried but failed to adequately explain Automata and Formal Grammar in the comments to this post by Will.

You can live a perfectly satisfactory life without understanding Automata Theory. I don't completely understand it myself. But I'm a stubborn sort; I hate to fail, so I'm going to try again here, below the fold.

I'll try to keep it as simple as I can, and use baseball as an example, since that's something we all understand here. I'll show how baseball is, by one definition, a language. If you still don't get it after this, don't worry, be happy.

 
Automata

An automaton is an abstract "machine", which moves from one "state" to another "state".

The Base/Out situation in baseball is a kind of automaton. It has 24 possible states:
0 out, 0 on
0 out, runner on 1st
0 out, runner on 2nd
0 out, runner on 3rd
0 out, runners on 1st & 2nd
0 out, runners on 1st & 3rd
0 out, runners on 2nd & 3rd
0 out, bases loaded
1 out, 0 on
1 out, runner on 1st
1 out, runner on 2nd
1 out, runner on 3rd
1 out, runners on 1st & 2nd
1 out, runners on 1st & 3rd
1 out, runners on 2nd & 3rd
1 out, bases loaded
2 out, 0 on
2 out, runner on 1st
2 out, runner on 2nd
2 out, runner on 3rd
2 out, runners on 1st & 2nd
2 out, runners on 1st & 3rd
2 out, runners on 2nd & 3rd
2 out, bases loaded
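
If you'd rather let a computer do the counting, here is a minimal Python sketch that builds the same 24 states. The names `BASES` and `STATES` are my own labels for illustration, not anything official: each state is just an out count (0, 1 or 2) paired with one of the eight runner configurations.

```python
from itertools import product

# Hypothetical labels for the eight base-runner configurations,
# in the same order as the list above.
BASES = [
    "0 on",
    "runner on 1st",
    "runner on 2nd",
    "runner on 3rd",
    "runners on 1st & 2nd",
    "runners on 1st & 3rd",
    "runners on 2nd & 3rd",
    "bases loaded",
]

# Every state pairs an out count with a runner configuration:
# 3 out counts x 8 configurations.
STATES = [f"{outs} out, {bases}" for outs, bases in product(range(3), BASES)]

print(len(STATES))   # 24
print(STATES[0])     # 0 out, 0 on
print(STATES[-1])    # 2 out, bases loaded
```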

When you start an inning, you begin in this state:
0 out, 0 on

One batter later, there are five possible states you could be in:
0 out, 0 on (batter homered)
0 out, runner on 1st
0 out, runner on 2nd
0 out, runner on 3rd
1 out, 0 on

So that's a rule of this Base/Out automaton: from the "0 out, 0 on" state, you can only go to one of these five states.

You can't immediately jump from "0 out, 0 on" to "2 out, bases loaded". You have to go through intermediate states first.

So you have state transition rules like this:
"0 out, 0 on" => "0 out, 0 on"
"0 out, 0 on" => "0 out, runner on 1st"
"0 out, 0 on" => "0 out, runner on 2nd"
"0 out, 0 on" => "0 out, runner on 3rd"
"0 out, 0 on" => "1 out, 0 on"

You do NOT have rules like this:
"0 out, 0 on" => "2 out, bases loaded"

Suppose you moved from the "0 out, 0 on" state to the "1 out, 0 on" state. From there, you can move on to the following states, and only these states:
1 out, 0 on (batter homered)
1 out, runner on 1st
1 out, runner on 2nd
1 out, runner on 3rd
2 out, 0 on
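
The transition rules so far fit naturally into a lookup table. What follows is only a sketch, assuming we store just the two states spelled out above; `TRANSITIONS`, `can_move` and `is_valid_path` are hypothetical names chosen for this example. Any state missing from the table simply has no rules yet, so every move out of it is rejected.

```python
# Transition rules copied from the text. Only the two states it
# spells out are included; a full table would cover all 24.
TRANSITIONS = {
    "0 out, 0 on": {
        "0 out, 0 on",            # batter homered
        "0 out, runner on 1st",
        "0 out, runner on 2nd",
        "0 out, runner on 3rd",
        "1 out, 0 on",
    },
    "1 out, 0 on": {
        "1 out, 0 on",            # batter homered
        "1 out, runner on 1st",
        "1 out, runner on 2nd",
        "1 out, runner on 3rd",
        "2 out, 0 on",
    },
}


def can_move(current, nxt):
    """Return True if the table has a rule current => nxt."""
    return nxt in TRANSITIONS.get(current, set())


def is_valid_path(states):
    """Return True if every consecutive pair of states follows a rule."""
    return all(can_move(a, b) for a, b in zip(states, states[1:]))


print(can_move("0 out, 0 on", "1 out, 0 on"))          # True
print(can_move("0 out, 0 on", "2 out, bases loaded"))  # False
print(is_valid_path(["0 out, 0 on", "1 out, 0 on", "2 out, 0 on"]))  # True
```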

 

Chomsky's innovation in linguistics was to apply automata theory to natural languages. Let's explore that.

Just like you started the inning in the "0 out, 0 on" state, you also begin your sentence in a certain state. Suppose you start a sentence with an article, like the word "The". Let's call that the "Article state."

There are rules which govern what kind of "word state" can follow this "Article state". Some examples:
You can say "The fox".
You can say "The quick" (as in "the quick fox").

You can't say "The the" (music groups aside).
You can't say "The jumped".
You can't say "The quickly".
You can't say "The of".

So English has state transition rules that look like this:
Article state => Noun state
Article state => Adjective state

But no rules like this:
Article state => Article state
Article state => Verb state
Article state => Adverb state
Article state => Preposition state

Just like you can't go directly from "0 out, 0 on" to "2 out, bases loaded" in baseball, you can't go directly from the "Article state" to a verb, adverb, preposition, or another article. You have to follow it with either an adjective or a noun. From there, the "Adjective state" and the "Noun state" will have their own rules about what states can follow.
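
The same lookup-table idea works for the article rules above. This is a toy sketch, not a real model of English: the six-word lexicon `WORD_CLASS` and the function `allowed_pair` are made up for illustration, and only the "Article state" has any rules, because that's the only one covered so far.

```python
# Hypothetical toy lexicon: each word mapped to its "word state".
WORD_CLASS = {
    "the": "Article",
    "fox": "Noun",
    "quick": "Adjective",
    "jumped": "Verb",
    "quickly": "Adverb",
    "of": "Preposition",
}

# Only the Article state's rules are given in the text.
ALLOWED_AFTER = {
    "Article": {"Noun", "Adjective"},
}


def allowed_pair(first, second):
    """Return True if second's word state may follow first's word state."""
    first_state = WORD_CLASS[first.lower()]
    second_state = WORD_CLASS[second.lower()]
    return second_state in ALLOWED_AFTER.get(first_state, set())


print(allowed_pair("The", "fox"))     # True
print(allowed_pair("The", "quick"))   # True
print(allowed_pair("The", "the"))     # False
print(allowed_pair("The", "jumped"))  # False
```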

Formal grammars

Formal grammars are how scientists express rules for these automata. They're usually written in a form like this:

S -> AB

which means that S (whatever S is) can be replaced with the sequence "A (whatever A is) followed by B (whatever B is)".

Let's look at an example:

NounPhrase -> Article Noun
This means we've defined a NounPhrase as an "Article" followed by a "Noun". What's an Article?
Article -> {the, a}
We've defined an "Article" as one of two words: "the" or "a". What's a Noun?
Noun -> {pig, dog, bat, base, pitcher, catcher}
We've defined "Noun" as one of these six words.

So with these three grammar rules, we can substitute to create 12 valid NounPhrases:
the pig
the dog
the bat
the base
the pitcher
the catcher
a pig
a dog
a bat
a base
a pitcher
a catcher
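
The substitution can be done mechanically. Here is a minimal sketch, assuming we store each rule as a list of alternatives; `GRAMMAR` and `expand` are names invented for this example. It only works for small grammars like this one where no symbol ever refers back to itself, since a self-referencing rule would expand forever.

```python
from itertools import product

# The three grammar rules from the text. Each symbol maps to a list of
# alternatives, and each alternative is a sequence of symbols or words.
GRAMMAR = {
    "NounPhrase": [["Article", "Noun"]],
    "Article": [["the"], ["a"]],
    "Noun": [["pig"], ["dog"], ["bat"], ["base"], ["pitcher"], ["catcher"]],
}


def expand(symbol):
    """Return every word sequence that the symbol can be rewritten into."""
    if symbol not in GRAMMAR:
        # Not a grammar symbol, so it is an actual word.
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each part, then take one choice per part, in order.
        parts = [expand(part) for part in alternative]
        for combo in product(*parts):
            results.append([word for piece in combo for word in piece])
    return results


phrases = [" ".join(words) for words in expand("NounPhrase")]
print(len(phrases))   # 12
print(phrases[0])     # the pig
print(phrases[-1])    # a catcher
```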

 

Now obviously, natural languages are much more complex than this, but these are the basic building blocks you use to describe any language.

Since you can describe baseball using such a formal grammar, the game of baseball is, by this definition, a language. It's no wonder writers love baseball so.

Comment status: comments have been closed. Baseball Toaster is now out of business.