Formalizing the commonsense knowledge needed for even simple reasoning problems is a huge undertaking. For this reason, researchers often study small toy problems, such as planning in the blocks world domain. Because such toy problems can gloss over some of the more interesting research issues, there has been a recent trend toward working on more realistic challenge problems. This page contains a collection of these challenge problems, solutions to some of them, and some other useful links.
The Common Sense Problem Page was originally created by Rob Miller. It is currently maintained by Leora Morgenstern. Please send email to email@example.com if you would like to contribute additional problems, solutions, or suggestions.
Some challenge commonsense problems are listed below. The full text for these problems can be found on this page; you can either scroll down the page or click on the highlighted links.
Many of these problems are listed together with a number of variants. An acceptable solution to a problem should
Solving one of these challenge problems should result in the discovery of new representational issues and problems that would not appear in an artificially small toy problem. If one encounters no difficulty along the way, one should be suspicious of the adequacy of the solution. Indeed, many of these problems are more difficult than they appear at first glance. Ernest Davis, who contributed many of these problems, has suggested that very few of these problems are currently solvable without considerable simplification. Two problems that he believes are solvable are The Surprise Birthday Present and the first half of Wolves and Rabbits.
Note that the categorization below is approximate. Many problems could be placed in more than one category.
A says that he witnessed B murdering C.
Infer that the evidence that B actually did murder C is stronger:
Drew McDermott hears about some dastardly deed. What conclusion should be drawn in each of the following cases:
The "something else" is one of the following list:
Develop a theory justifying the following:
If you put a half dozen rabbits in a pen and care for them suitably for a period of a few months, you will generally end up with more than a half dozen rabbits in the pen. If, however, you fail to feed them, then you will end up with no (live) rabbits in the pen. If they are all of one sex, and none of the rabbits is pregnant to start with, you will end up with no more than a half dozen rabbits no matter how long you wait.
If you put a couple of wolves with a half dozen rabbits in a pen overnight, then in the morning, you will have two wolves and no rabbits. If, however, the wolves are chained by a short chain at one end of the pen, you will probably have as many animals in the morning as you started with. A metal chain will work for this purpose; a rope is not reliable.
Here's a challenge problem that I think would be worth working on. I think it's hard, I think it's doable, and I think it would be a significant advance.
This is from a recent article, "Scientific Thinking in Young Children: Theoretical Advances, Empirical Research, and Policy Implications," by Alison Gopnik, Science 337, 2012, 1623-1627.
An experimenter took frogs from a box of all frogs or else took frogs from a box of almost all ducks. Then she left the room, and another experimenter gave the child [20 months] a small bowl of frogs and a separate bowl of ducks. When the original experimenter returned, she extended her hand ambiguously between the bowls. The children could give her either a frog or a duck. When she had taken frogs from a box of all frogs, children were equally likely to give her a frog or a duck. When she had taken frogs out of the box that was almost all ducks, children gave her a frog. In the first case, the children concluded that she had merely drawn a random sample from the box, but in the second case they concluded that she had displayed a preference for frogs.
Incidentally, I'm not endorsing the article as a whole, but the experiment is certainly interesting.
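One way to make the children's inference concrete is a small Bayesian comparison between two hypotheses: the experimenter sampled at random, or she deliberately picked frogs. The sketch below is only an illustration. The box proportions, the number of draws, the 50/50 prior, and the function name are all assumptions introduced here, not details from the article.

```python
def posterior_preference(frog_fraction, draws, prior_preference=0.5):
    """Posterior probability that the experimenter prefers frogs,
    given that every one of the `draws` items she took was a frog.

    Assumed model (not from the article):
      - under "random sampling", each draw is a frog with probability
        `frog_fraction`, and draws are treated as independent;
      - under "preference", she always picks a frog.
    """
    likelihood_random = frog_fraction ** draws
    likelihood_preference = 1.0
    evidence = (prior_preference * likelihood_preference
                + (1.0 - prior_preference) * likelihood_random)
    return prior_preference * likelihood_preference / evidence


# Hypothetical numbers: five frogs drawn from each kind of box.
all_frogs = posterior_preference(frog_fraction=1.0, draws=5)
mostly_ducks = posterior_preference(frog_fraction=0.1, draws=5)
```

Under these assumptions, drawing frogs from the all-frog box leaves the posterior at the prior, since the observation is equally expected under both hypotheses. Drawing only frogs from the mostly-duck box pushes the posterior strongly toward a preference. A real solution would also have to represent how the child gets from "she prefers frogs" to "give her a frog".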
Sam got straight C's in high school math and has not thought for a moment about math in the 20 years since. Infer that Sam is not the person to ask about a calculus problem.
Note that this solution focuses more on the issue of elaboration tolerance than the naive psychology aspects of the problem.
You are riding Black Beauty (a well-trained horse) in the dark, and you come to a bridge that he has often crossed before. Black Beauty absolutely refuses to set foot on the bridge. Infer that something may be wrong with the bridge.
As an addendum to Ernie Davis's 'Trusting the Horse' problem, here's a real incident from my own youth. Black Beauty (in this version, a mare), while walking, develops a limp and is reluctant to step on her front leg, holding that hoof slightly above the ground. Her driver, who claims to be an expert on horses, tries to force her to lift her front right leg, on the grounds that a horse placing too much weight on a hoof is often a sign of a problem in that hoof.
Infer that the driver is a fool.
When baking cookies, after you prepare the cookie dough, you lightly spread flour over a large flat surface; then roll out the dough on the surface with a rolling pin; then cut out cookie shapes with a cookie cutter; then put the separated cookies separately onto a cookie sheet and bake.
What happens if: You do not flour the surface? You use too much flour? You do not roll out the dough, but cut the cookies from the original mass? You roll out the dough but don't cut it? You cut the dough but don't separate the pieces?
What happens if the surface is covered with sand? Or covered with sandpaper? If the rolling pin has bumps? or cavities? or is square? If the cookie cutter does not fit within the dough? What happens if you use the rolling pin just in the middle of the dough and leave the edges alone? If, rather than roll, you pick up the rolling pin and press it down into the dough in various spots? Ordinarily the cutting part of the cookie cutter is a thin vertical wall above a simple closed curve in the plane; suppose it is not thin? or not vertical? or not closed? or a multiple curve? If the cuts with the cutter overlap one another?
Does the dough end up thinner or thicker if you exert more force on the rolling pin? If you roll it out more times? If you roll the pin faster or slower? Do you get more or fewer cookies if the dough is rolled thinner? If a larger cookie cutter is used? If there is more dough? If the cuts with the cutter are spread further apart?
What is the point of placing waxed paper on the surface? What happens if the above procedure is tried with a recipe for drop cookies? bar cookies? refrigerator cookies?
Characterize the following:
A cook is cracking a raw egg against a glass bowl. Properly performed, the impact of the egg against the edge of the bowl will crack the eggshell in half. Holding the egg over the bowl, the cook will then separate the two halves of the shell with his fingers, enlarging the crack, and the contents of the egg will fall gently into the bowl. The end result is that the entire contents of the egg will be in the bowl, with the yolk unbroken, and that the two halves of the shell are held in the cook's fingers.
What happens if: The cook brings the egg to impact very quickly? Very slowly? The cook lays the egg in the bowl and exerts steady pressure with his hand? The cook, having cracked the egg, attempts to peel it off its contents like a hard-boiled egg? The bowl is made of loose-leaf paper? of soft clay? The bowl is smaller than the egg? The bowl is upside down? The cook tries this procedure with a hard-boiled egg? With a coconut? With an M & M?
Characterize the following:
The following experiment can be used to estimate absolute zero using household objects. Prepare a pot of boiling water and a pot of ice water. Take a graduated baby bottle and hold it (using tongs) in the boiling water. After a few minutes, when it has stopped bubbling, remove it and plunge it rapidly into the ice water. Water will then stream into the baby bottle through the nipple, as the gas contracts. (Actually, the nipple collapses: to allow the flow of water, you have to manipulate the nipple.) When the flow of water stops, the volume of the water that has entered the bottle may be measured by holding the bottle right-side up; the final volume of the gas at 0 degrees C may be measured by holding the bottle upside down. The initial volume of the gas at 100 degrees C is the sum of the final volume of the gas plus the volume of the water. By doing a linear extrapolation between these two values to the point where the volume of the gas would be zero, one can find the value of absolute zero.
What would happen: If the bottle is immersed only very briefly in the hot water? Or only very briefly in the cold water? If it is laid on top of the pots of water rather than immersed in them? If the bottle is left in the outside air a long time between being in the hot water and being in the ice water? If the bottle has an open end with no nipple? If the bottle has other holes besides this nipple? If the bottle is opaque? If you use containers with air at 100 degrees and 0 degrees rather than water? If the quantity of ice water in the second pot is very small? very large? or if the quantity of hot water in the first pot is very small or very large? If the bottle is coated with Styrofoam? If the bottle is not graduated? Why is the following not a reasonable experiment: "Take a volume of gas in your hands; cool it; see how much it shrinks."
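The extrapolation step of the experiment can be sketched as a short calculation. It assumes that the gas volume varies linearly with temperature at constant pressure, as the problem statement does. The readings used below are made-up illustrative values, not measurements, and the function name is hypothetical.

```python
def estimate_absolute_zero(final_gas_volume, water_volume,
                           t_hot=100.0, t_cold=0.0):
    """Extrapolate gas volume linearly in temperature down to volume zero.

    final_gas_volume: gas volume at t_cold (read with the bottle upside down)
    water_volume:     volume of water that entered the bottle
    The gas volume at t_hot is the sum of the two, as described in the text.
    Returns the temperature, in the units of t_hot and t_cold, at which
    the extrapolated volume reaches zero.
    """
    v_hot = final_gas_volume + water_volume
    v_cold = final_gas_volume
    slope = (v_hot - v_cold) / (t_hot - t_cold)  # volume change per degree
    return t_cold - v_cold / slope


# Hypothetical readings in millilitres, chosen only for illustration.
estimate = estimate_absolute_zero(final_gas_volume=183.0, water_volume=67.0)
```

With these illustrative readings the estimate comes out close to -273 degrees C. Many of the variants listed above (brief immersion, extra holes, an ungraduated bottle) amount to asking which input to this calculation becomes wrong or unmeasurable.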
In connection with Ernie Davis's 'Absolute Zero' problem, here's an experiment with a very counter-intuitive outcome. The apparatus is two containers, with one fitting loosely into the other. Hot water is placed into the inner one, and iced water into the outer one, forming a cooling jacket. The experiment measures how long it takes for the water in the inner container to cool to room temperature. If the initial temperature of the hot (inner) water is very high, it cools to room temperature in less time than if its initial temperature is lower.
Why is this surprising? More difficult, and maybe outside 'common sense,' what explanation could be given for it?
Consider dropping the following objects on the floor from a height of five feet:
Develop a theory that connects the final state of these objects after being dropped, and their behavior while falling, to their other material properties.
Formally characterize the structure of a linked chain, and infer (a) that pulling on one end will cause the whole chain to follow; (b) that the chain is very flexible; (c) that cutting one link will give two shorter chains and that linking two chains together end to end gives a longer chain.
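As a rough illustration of inferences (a) through (c), here is a toy computational model. It is not the formal characterization the problem asks for. It assumes a chain is an ordered list of links, that cutting destroys the cut link, and that in one dimension adjacent links can be at most `max_gap` apart. All names and numbers are hypothetical.

```python
def cut(chain, index):
    """Cut the link at `index`; the cut link itself is destroyed,
    leaving the links before it and the links after it."""
    return chain[:index], chain[index + 1:]


def join(first, second):
    """Link two chains end to end (the extra joining step is abstracted away)."""
    return first + second


def pull(positions, new_head, max_gap=1.0):
    """1D model: move the first link to `new_head`. Each later link moves
    only as far as needed to stay within `max_gap` of its predecessor."""
    result = [new_head]
    for pos in positions[1:]:
        prev = result[-1]
        if pos - prev > max_gap:
            pos = prev + max_gap
        elif prev - pos > max_gap:
            pos = prev - max_gap
        result.append(pos)
    return result


chain = list(range(10))
left, right = cut(chain, 4)                      # two shorter chains
longer = join(chain, left)                       # one longer chain
dragged = pull([0.0, 0.5, 1.0, 1.5], 5.0)        # a big pull drags every link
slack = pull([0.0, 0.5, 1.0], 0.2)               # a small pull moves only the head
```

The `pull` function captures both (a) and (b) in miniature: a large displacement propagates down the whole chain, while a small one is absorbed by the slack between links. Cutting an end link, or representing three-dimensional shape, is left out of this sketch.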
It is necessary to walk several hundred yards in rain. Explain why if the rain is moderate then one should run, but not if one has an umbrella; but if the rain is very heavy then running is of no use unless one has an umbrella, and even then it is best to hurry; and if there is also a strong wind one is likely to get more wet than if not, even with an umbrella.
Characterize the following physical operation:
A gardener who has valuable plants with long delicate stems protects them against the wind by staking them; that is, by plunging a stake into the ground near them and attaching the plants to the stake with string.
What would happen: If the stake is only placed upright on the ground, not stuck into the ground? If the string were attached only to the plant, not to the stake? To the stake, but not to the plant? If the plant is growing out of rock? Or in water? If, instead of string, you use a rubber band? Or a wire twist-tie? Or a light chain? Or a metal ring? Or a cobweb? If instead of tying the ends of the string, you twist them together? Or glue them? Or place them side by side? If you use a large rock rather than a stake? If the string is very much longer, or very much shorter, than the distance from the stake to the plant? If the distance from the stake to the plant is large as compared to the height of the plant? If the stake is also made out of string? Trees are sometimes blown over in heavy storms; can they be staked against this?
A humanoid robot is flying economy class on a major airline and is required to "eat" the packaged meal that has been served to it. Like its fellow human travelers, the robot can be assumed to be in a standard seat and to have two arms which function similarly to theirs, with similar restrictions on mobility; e.g. because of the cramped conditions, the robot's elbows have to remain close to its chest. In front of the robot is the familiar small table, occupied almost entirely by the tray containing the meal, neatly packaged in little plastic containers with transparent lids, along with a small plastic cup containing a foil-sealed tub of water, and a cellophane envelope containing a set of plastic cutlery, napkin, condiments, etc. For simplicity, assume that eating can be taken to consist of manipulating the food and drink to the robot's mouth, where the utensils are emptied at typical human diner rate. The robot is conventional in eating habits; thus, it tries at all times to not spill, to use the appropriate utensils, and to obey conventions as to when it is permissible to eat with its fingers (chicken, no; asparagus, yes). Moreover, it begins its meal with the starter, follows this with the main course (along with the mini roll which it has spread with butter), then it has dessert, and finally the cheese and biscuits. To complicate matters, the robot drinks water at various stages of the meal. Everything must be kept on the tray or table, including the packaging for the plastic cutlery, the tops of containers, and the containers and their contents. So, like its human companions, the robot quickly becomes involved in an elaborate Chess game, continually maneuvering the containers so that the chosen one is in position.
The problem is to formalize some aspect of this problem: e.g. the problem of food manipulation, or of planning how to eat the next part of the meal. (For example, consider the situation if the only way to tear through the plastic wrapping is with a sharp object such as a key, and the robot's keys are either in the back pocket of its trousers, or in its purse on the floor.) Initially this might be done at a fairly abstract level. However, the eventual aim is an epistemologically adequate formalization. Toward this end, formalizing the robot's mental life is interesting. For example, the robot's beliefs, desires, and preferences might lead it to try to eat the portion of processed cheese, and this goal might persist until the robot realizes that it cannot open the cheese, or assuming that it manages to do so, until it decides that the cheese tastes worse than it looks. Those interested in multi-agent systems might care to formalize the arrival of coffee, served by a member of the cabin crew, usually from a small tray held over the no-man's land of the adjacent seat, where the robot helps itself to milk and sugar.
The combination of a safe consists of 3 numbers between 1 and 50, with a tolerance of plus/minus two. No one knows what the combination is. Infer (a) that it will not be possible to open the safe using the combination within 5 minutes (unless you are very lucky); (b) that it will be possible in a couple of days' work.
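The arithmetic behind both inferences can be sketched as follows. Two assumptions are introduced here and are not part of the problem statement: that a tolerance of plus/minus two means a dialed number opens the lock whenever the true number is within two of it, so trying 3, 8, ..., 48 on each wheel covers 1 through 50; and that each attempt takes about one minute. The names are hypothetical.

```python
import math


def trials_needed(dial_max=50, tolerance=2, wheels=3):
    """Worst-case number of attempts when each dialed value covers
    a window of 2 * tolerance + 1 consecutive numbers."""
    per_wheel = math.ceil(dial_max / (2 * tolerance + 1))
    return per_wheel ** wheels


SECONDS_PER_TRIAL = 60  # assumption: roughly one minute per attempt

worst_case_trials = trials_needed()
worst_case_hours = worst_case_trials * SECONDS_PER_TRIAL / 3600
trials_in_five_minutes = 5 * 60 // SECONDS_PER_TRIAL
# Chance of success in five minutes, assuming every combination is equally likely.
chance_in_five_minutes = trials_in_five_minutes / worst_case_trials
```

Under these assumptions there are 1000 distinct attempts to make, five minutes allows only 5 of them (a 0.5% chance of success), and exhausting them all takes under 17 hours, which is about two working days. A faster or slower rate per attempt changes the totals but not the qualitative conclusions.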
It is morning and you recall that you have to take out the garbage tonight. You are afraid that tonight it will slip your mind. Infer that you would do well to write a memo reminding yourself and attach it in some place where you are sure to look tonight.
Alice and Bob want to surprise their sister Carol with a joint present for her birthday, two weeks from now. They therefore go into a closed room to decide on the present and to plan how they will buy it.
The plan will not work:
The following constraints must be satisfied by the solution:
Note that the problem involves a variety of domains: time, a little space and physics, knowledge, perception, naive psychology, multi-agents.
It is 9:00 PM, you are very tired, and you are settling down in a comfortable armchair with a book of conference proceedings. You are supposed to call your mother at 9:30. Infer that you would do well to set the alarm on your watch for 9:25.
(A less violent version of Little Red Riding Hood)
A small girl is walking through a forest to visit her grandmother, and she passes a bush behind which a Wolf is hiding, planning to pounce out and eat her. Just as she gets close, however, the Wolf hears the singing of the woodcutters as they start work nearby. The Wolf therefore decides to stay hidden and not pounce on the little girl after all. The problem is to explain why the Wolf decides to stay behind the bush.
Give a general purpose characterization of what constitutes a handle, in the ordinary sense of door-handle or drawer-handle, which is sufficient to enable one to infer from a qualitative description of the shape of a part of an object whether or not it can be a handle for that object. In particular, it should be possible to infer that a blunt conical projection cannot be a handle, but an inverted conical projection can be; that a simple rectangular projection can be a drawer handle, but not a suitable handle for lifting a heavy object; that a piece of rope attached at one end can be a door handle; and that a hooked or u-shaped projection, or a rope fastened at both ends, can be a handle for almost anything.
There are many ways in which the meaning of a two word noun phrase can be related to the meanings of the individual nouns, and syntax gives little indication of which applies in any given case. Some such phrases are purely idiomatic and must be individually learned (e.g. "tag sale," "mustard gas") but in most cases a speaker who has never seen the particular phrase can figure out its meaning from semantic constraints and commonsense knowledge.
Characterize the commonsense knowledge used in determining that the correct meaning of the following noun phrases is more plausible than any of the alternative readings: