
So what's it do?

I see a bunch of AI buzzwords.



Eventually, learn arbitrary programs from data, e.g.

  Input:  (learn 'fib '(x) '(((1) 1) ((2) 1) ((3) 2) ((4) 3)))
  Output: (defun fib (n) (if (< n 3) 1 (+ (fib (1- n)) (fib (- n 2)))))

as well as standard machine-learning tasks such as supervised classification.
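To make the task concrete, here is a naive enumerative synthesizer over a tiny arithmetic grammar. This is not plop's algorithm (plop guides its search with probabilistic models over programs in normal form); it is just the brute-force baseline for "find a program matching the input/output pairs", and all names in it are mine:

```python
# Naive enumerative program synthesis: try every expression in a small
# grammar (variable x, constants 0-2, operators + - *) until one
# reproduces all the input/output examples.

def expressions(depth):
    """Yield (source, function) pairs for all expressions up to `depth`."""
    yield "x", lambda x: x
    for c in (0, 1, 2):
        yield str(c), (lambda c: lambda x: c)(c)
    if depth == 0:
        return
    subs = list(expressions(depth - 1))
    ops = (("+", lambda a, b: a + b),
           ("-", lambda a, b: a - b),
           ("*", lambda a, b: a * b))
    for ls, lf in subs:
        for rs, rf in subs:
            for op_s, op_f in ops:
                yield (f"({ls} {op_s} {rs})",
                       (lambda lf, rf, of: lambda x: of(lf(x), rf(x)))(lf, rf, op_f))

def learn(examples, max_depth=2):
    """Return the source of the first expression matching all examples."""
    for src, f in expressions(max_depth):
        if all(f(x) == y for x, y in examples):
            return src
    return None

# Learn x**2 + 1 from three data points:
print(learn([(1, 2), (2, 5), (3, 10)]))  # → (1 + (x * x))
```

Even this toy version shows why the search needs guidance: the number of candidate expressions grows explosively with depth.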

For what it does right now, see the examples at the bottom of the quick start guide:

http://code.google.com/p/plop/wiki/QuickStart

For more technical background see e.g. http://metacog.org/main.pdf (my dissertation). I will also add a list of relevant publications to the wiki...


Pretty interesting... actually looks somewhat related to my thesis:

http://www.cs.utexas.edu/~jderick/thesis.pdf

I'll have to look at it more closely when I get a chance.

I was working on learning proofs, rather than programs, but I think there are a lot of similarities.


> I was working on learning proofs, rather than programs, but I think there are a lot of similarities.

I'm just a mathematician, and my mind has obviously been warped by continua and the Axiom of Choice, but aren't they the same thing?

http://en.wikipedia.org/wiki/Curry-Howard_correspondence

Actually, I've gotten interested in learning more about theoretical CS lately, but everyone in my department is too old/applied/Russian to care, and my university's CS people only seem to care about operating systems and e-commerce. Can you recommend a logical starting place?


The notation you'd use to represent the program/proof is very important.


One interesting difference is that if you are learning a proof, then you know when you are successful. At least for my research, I generated a bunch of small cases of the theorem I was trying to prove, then tried to learn the general case. If I was successful, I would have a proof of the original conjecture. When learning programs from some data, you can always test your program against the data but never can be sure that you are generalizing the data properly.
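A tiny illustration of that last point (hypothetical code, function names mine): two hypotheses that both fit the fib training pairs from the top of the thread, yet disagree on everything beyond them.

```python
# Two hypotheses consistent with the training pairs ((1,1) (2,1) (3,2) (4,3))
# that diverge off the data -- matching the examples never guarantees
# correct generalization.

def fib(n):
    """The intended recursive definition."""
    return 1 if n < 3 else fib(n - 1) + fib(n - 2)

def lookup(n):
    """A degenerate hypothesis: memorize the data, answer 0 elsewhere."""
    return {1: 1, 2: 1, 3: 2, 4: 3}.get(n, 0)

train = [(1, 1), (2, 1), (3, 2), (4, 3)]
print(all(fib(n) == lookup(n) for n, _ in train))  # → True (both fit)
print(fib(5), lookup(5))                           # → 5 0 (they diverge)
```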


The equivalence seems to hold only when the program is complete. Thus, while all proofs can be expressed as programs, not all programs can be expressed as proofs.


Can't you restrict the domain so that the program is complete (for that restricted domain)?


I meant correct, not complete.

I refer to the article on wikipedia which states: "A converse direction is to use a program to extract a proof, given its correctness. This is only feasible if the programming language the program is written for is very richly typed: the development of such type systems has been partly motivated by the wish to make the Curry-Howard correspondence practically relevant."


Rather, programs are equivalent to proofs given their correctness. Additionally, according to the Wikipedia article, extracting a proof from a program requires a very richly typed programming language.


Thanks for the pointer. I was actually just looking at ACL2 the other day - I have a friend here at Google who is investigating integrating it with plop as a 20% project. I'm very excited that your work combines theorem-proving with learning, kudos! Do you do anything probabilistic to decide which generalizations to try?


It is more just a set of techniques for generalizing theorems. At this point in time I would say there are a few roadblocks to doing some kind of data-driven probabilistic search for inductive proofs (like ACL2 does). The first is that there just aren't that many generalization techniques already implemented to choose from. The second is that most provers don't even have the basic backtracking mechanisms built in to support that kind of behavior. Although this may seem surprising, the search space for a typical theorem is so huge (technically infinite, but practically speaking just too complex to bother with) that usually if you don't find a proof right away with a high-confidence technique, then you are probably just going to go off into the weeds.

Now that more processing power than ever is available, I'm sure this will gradually change. But a lot of theorem proving technology is very old (dates to the 70s) so it may take some time.

The main technique I came up with looks at a particular finite case of a conjecture and generates a proof for that case using simple rewriting. You then look for patterns in that proof and use them to create a generalization. As a simple example, imagine you're trying to prove a theorem about lists. First, I consider the case where the list has only 4 elements. Since this is a finite case, I can prove it easily with rewriting. Now I look at that proof and notice that rule X was applied 4 times. If I do the same thing when the list has 5 elements, that rule is applied 5 times. So in the general case, apply the rule n times.
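A toy sketch of that idea (my construction, not code from the thesis): prove `length(xs) = n` for concrete n-element lists by rewriting, count how often each rule fires, and notice that the count for the cons rule tracks n, which is exactly the pattern a generalization step would exploit.

```python
# Prove length(xs) = n for concrete lists by rewriting, recording how
# many times each rewrite rule is applied.

from collections import Counter

NIL = ("nil",)

def cons_list(n):
    """Build the term cons(n-1, ... cons(0, nil)) for an n-element list."""
    term = NIL
    for i in range(n):
        term = ("cons", i, term)
    return term

def length(term, counts):
    """Rewrite length(term) down to a number, recording rule applications."""
    if term == NIL:
        counts["len-nil"] += 1   # rule: length(nil) -> 0
        return 0
    _, _, tail = term
    counts["len-cons"] += 1      # rule: length(cons(h, t)) -> 1 + length(t)
    return 1 + length(tail, counts)

for n in (4, 5):
    counts = Counter()
    assert length(cons_list(n), counts) == n
    print(n, dict(counts))
# The cons rule fires exactly n times (and the nil rule once), suggesting
# the general proof: apply the cons rule n times, then the nil rule.
```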

One novel thing I came across in this work is something I called "hybrid proof terms". I found that it is very difficult to find patterns in sequences of terms (the typical way of representing a rewrite-based proof), so I used a representation that factors out the things that do not change from one line of the proof to the next.

Feel free to email me if you would like to discuss any more details. I'm not really working on this project anymore but I still find it a fascinating area of study.

Also, for your friend who is thinking about working with ACL2, he should know that the acl2-help mailing list is quite friendly and helpful.


The problem of working out what a program does given its input and output is a very general one. Maybe there could be a competition for programs that do this? (Like the Loebner Prize, but more sensible since the Loebner Prize doesn't really have anything to do with progress in AI). Or maybe such a competition already exists and I don't know about it.

Perhaps a generalisation of this could be used as an intelligence test for AI programs. (for criteria for such a test, see http://www.overcomingbias.com/2008/10/economic-defini.html )


Well, the difficulty with this is that program induction is in fact too general to be very interesting - to create a competition you would need to publish some test problems, or describe a special distribution of "natural programs" that you especially cared about...


> program induction is in fact too general to be very interesting

Intelligence is general and is capable of coping with novel problems. AI therefore involves building machines which have that level of generality. To say this isn't interesting is to say AI isn't interesting.

If you published exactly the problems that were going to be in the test, programs would be written that would solve those problems but would be very poor at doing anything else.

Anyway here are some input-output pairs that I think would be suitable for a learning program:

(i) append 'x' to the end of the string:

  r => rx
  123 => 123x

Similarly other insertions, deletions, copies of characters or groups of characters.

(ii) learning Roman or Arabic numerals or other number-coding schemes:

  / => i
  // => ii
  /// => iii
  ///////// => ix

  3 => xxx
  7 => xxxxxxx
  9 => xxxxxxxxx
  11 => xxxxxxxxxxx

Ideally, once the program has learnt the above two functions it should have an understanding of the underlying concepts and therefore find it easier to learn functions such as:

  iii => 3
  xii => 12
  xvi => 16
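For reference, the target functions behind these pairs, written out directly - i.e. what a successful learner would have to recover. (A sketch; the function names are mine, not from any proposed test suite.)

```python
# Direct implementations of the example target functions above.

def append_x(s):
    """(i) append 'x': r => rx, 123 => 123x"""
    return s + "x"

def slashes_to_roman(s):
    """(ii) tally-to-Roman: /// => iii, ///////// => ix"""
    n = len(s)
    out = ""
    for value, symbol in ((10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")):
        while n >= value:
            out += symbol
            n -= value
    return out

def digit_to_tally(s):
    """decimal-to-tally: 3 => xxx, 11 => xxxxxxxxxxx"""
    return "x" * int(s)

print(append_x("r"), slashes_to_roman("/////////"), digit_to_tally("3"))
# → rx ix xxx
```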


> Intelligence is general and is capable of coping with novel problems. AI therefore involves building machines which have that level of generality.

There is a bias when you think of "programs" to think of programs that a human would actually write or consider useful. But in fact, this is only a tiny (and nontrivial to specify) fraction of the space of "possible programs" - see for example _Foundations of Genetic Programming_, by Langdon and Poli, for some explorations of general program space. Most possible programs are completely useless and uninteresting to humans (same with proofs, grammars, etc.).

To do AI you have to realize that in the sentences: "Humans have general intelligence and can solve problems in many domains" and "Breadth-first search is a problem-general technique that will always find a solution if one exists" the word "general" does not mean the same thing at all! (cf. the huge literature on inductive bias in human cognition).

For a nice list of program induction problems solvable by search (with a system called ADATE), see http://www-ia.hiof.no/~rolando/Examples/index.html .


You're right that most possible programs are useless and uninteresting. So I agree with you that if a learning program were given problems generated from random programs, that wouldn't be interesting.

I am not suggesting doing that. What I am suggesting is that the problems-to-solve be generated by humans.

Incidentally, what I am suggesting isn't quite inductive programming but something slightly more general, in that the job of the problem-solver isn't to generate a program that solves the input/output pairs, but to guess what the next output will be for a given input. (The problem-solver may or may not do this by internally creating a program that solves the problem.)

Anyway I've done a quick write-up of what I propose: http://www.includipedia.com/wiki/User:Cabalamat/Function_pre...




So, it's genetic programming... right?


Well, sort of. Genetic programming with probabilistic modeling and sampling instead of crossover/mutation (i.e. it's an estimation-of-distribution algorithm), where programs are represented in reduced (normal) form, and which variables to use in the probabilistic models is determined adaptively and changes as the search progresses... ;->
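For readers unfamiliar with the term, here is a minimal estimation-of-distribution algorithm (UMDA over bitstrings, maximizing the number of 1s). This shows only the generic scheme named above; plop builds its probabilistic models over features of normal-form program trees, not raw bits.

```python
# UMDA: fit a probabilistic model to the best samples, then sample the
# next generation from the model, instead of applying crossover/mutation.

import random

random.seed(0)
N, POP, ELITE = 20, 100, 25

probs = [0.5] * N  # one independent Bernoulli parameter per bit
for generation in range(30):
    pop = [[int(random.random() < p) for p in probs] for _ in range(POP)]
    pop.sort(key=sum, reverse=True)  # fitness = number of 1s (OneMax)
    elite = pop[:ELITE]
    # Model re-estimation replaces crossover/mutation:
    probs = [sum(ind[i] for ind in elite) / ELITE for i in range(N)]

best = pop[0]
print(sum(best))  # typically converges to the optimum
```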


Genetic programming, with mutations guided by learned patterns from a corpus of data? (or that's what I imagine it to be)




