Cohort and case-control studies
A cohort study assembles a group of patients and follows them over
time. An example is the Nurses' Health Study, in which over 20,000 nurses were identified
and followed up annually with tests and surveys for over 25 years (this study is still
ongoing). These studies provide valuable information, but are expensive and
time-consuming.
In most cohort studies, you want to assemble patients without the disease in question,
and then follow them to see who develops the disease. By comparing the characteristics of
patients who develop the disease with those of patients who do not, you can identify risk
factors. A study to identify risk factors for breast cancer is diagrammed below:

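The comparison described above is often summarized as a relative risk: the proportion of patients with a characteristic who develop the disease, divided by the proportion of patients without it who do. The following is a minimal sketch under stated assumptions; the function name `relative_risk` and all of the counts are hypothetical and invented for illustration, not taken from any actual study.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of disease in the exposed group divided by risk in the unexposed group."""
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group sizes must be positive")
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    if risk_unexposed == 0:
        raise ValueError("no cases in the unexposed group; relative risk is undefined")
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 1,000 women with a suspected risk factor and 1,000 without,
# all disease-free at enrollment and followed for the same length of time.
rr = relative_risk(exposed_cases=30, exposed_total=1000,
                   unexposed_cases=15, unexposed_total=1000)
print(f"Relative risk: {rr:.2f}")
```

A relative risk near 1 suggests the characteristic is not associated with the disease, while a value well above 1 suggests a possible risk factor. A real analysis would also report a confidence interval and adjust for other differences between the groups.
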
In a cohort study of prognosis, you would continue to follow patients after
they develop the disease in question to see how they do regarding mortality, disease
progression, and other important outcomes. Therefore, cohort studies of prognosis don't
necessarily begin with a group of disease-free patients. Because they don't, it is
important that patients either be at the same stage of the disease at the start of the
study, or that you are able to precisely characterize the stage of the disease. A study to
determine the prognosis for patients with colon cancer is shown below:

Alternatively, you can enroll all patients with colon cancer, being
careful to classify them by stage at the beginning of the study:

When it is important to compare the prognosis for patients with disease to those
without, a group of patients without the disease in question is sometimes followed in a
cohort study. Consider the example of well-differentiated prostate cancer described
earlier: the best way to know that these patients have the same prognosis as members of
the population without prostate cancer is to follow a similar group of patients without
prostate cancer for a long period, perhaps 10 or 20 years.
While cohort studies are considered the best way to do a study about prognosis, you can
also use a case-control design. Cohort studies are prospective (patients are
followed forward in time), whereas case-control studies are retrospective (looking back in time).
Patients with a disease who have suffered a bad outcome, such as death or
recurrence, are identified and compared with patients who have the disease but haven't
suffered the bad outcome. For example, a researcher might use a cancer registry to identify
a group of breast cancer patients who have died, and compare them with a similar group of
patients with breast cancer who are still living. This is diagrammed below:

Note that in a case-control study there are two major potential biases that
don't exist to the same extent in a cohort study. First, you have to pick controls. If the
controls are different from the cases (e.g., older, larger, different lifestyle, different
habits), that introduces bias. Second, you are looking backwards in time (retrospective
design) to determine prognostic factors. While in some cases you may have good records and
little bias (for example, descriptions of the surgery performed and the amount of
radiation given), other variables may be subject to significant "recall bias".
For example, knowing that breast cancer is now thought to be related to fatty food and
alcohol could bias women's recall of their diet. Another important bias should be
obvious from the diagram above: you will have to rely on the families, friends, and
caregivers of women who have died to tell you about those patients' diets and habits.
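
Because a case-control study starts from the outcome and looks back, the comparison between cases and controls is usually expressed as an odds ratio rather than a relative risk. The sketch below assumes a simple two-by-two table; the function name `odds_ratio` and every count are hypothetical and chosen only to illustrate the arithmetic.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    if cases_unexposed == 0 or controls_exposed == 0:
        raise ValueError("a zero cell in the denominator makes the odds ratio undefined")
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Hypothetical registry data: 100 breast cancer patients who died (cases) and
# 100 similar patients who are still living (controls), classified by whether
# a suspected prognostic factor was recorded.
ratio = odds_ratio(cases_exposed=40, cases_unexposed=60,
                   controls_exposed=25, controls_unexposed=75)
print(f"Odds ratio: {ratio:.2f}")
```

An odds ratio above 1 suggests the factor is more common among patients with the bad outcome. The number is only as trustworthy as the exposure data behind it, so the control-selection and recall biases described above apply directly to each cell of the table.
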