Data Analysis 765 Syllabus
Spring 2005
Instructor: Ken Petren, email: ken.petren@uc.edu (6-9719)
Office Hours: by appointment, 802 Rieveschl
TA: Andrew Osterberg. 6-9715 osterbar@email.uc.edu
Lecture & Lab: 2-5 Tues Thurs. (required)
Computer Lab: Room 622 Rieveschl
GOALS:
To make you better able to understand, analyze and interpret data.
The course will emphasize:
- Basic statistical foundations and framework (biological context)
- Experimental design
- Hypothesis formulation and testing
- Interpretation and presentation of results
My goal is to provide you with a conceptual framework to facilitate your career development. I will try to make you aware of limitations and assumptions you will encounter in analyzing data. As a graduate student, you must supply the motivation to get the most out of this course. For instance, you must be prepared to delve further into the reference texts and other literature on your own when you are designing experiments and analyzing your own data.
RESOURCES
Computer programs: SYSTAT 10.2 (PC); Resampling Stats.
Text: Zar, J. H., Biostatistical Analysis (strongly recommended)
Computer Lab: You have access any time other classes are not using the lab. There will be a fair amount of traffic 9-12 and 1-4 for Ecology 303. We will post a schedule soon. See me for other access to SYSTAT.
GRADING: Your final grade will be based on 100 points. 90-100 points will earn you an A, 80-90 a B, and so on; however, grades may be scaled up for the entire class based on group and individual effort as judged by the instructor.
Problem Sets 75% (see tips below)
Weekly problem sets will be graded for logic, presentation and interpretation. Points will be deducted for cutting and pasting raw output, poor graphical presentation, poor grammar, and inaccurate or unjustified analyses. Please see the Guide to Problem Sets.
** The last problem set is worth twice as much as the others, and in it you will focus on a problem of your choice.
Participation 25%
- Paper presentation. The objective is not to summarize the paper, but to ask insightful DATA ANALYSIS questions of your comrades.
- Paper discussion. You must read the paper twice, and you must be prepared to participate.
- Asking / answering questions in Lecture and Lab.
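As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes every score is expressed as a fraction between 0 and 1, that all problem sets except the last count equally, and that the last counts double. The function name `final_points` and the example scores are hypothetical, and the sketch ignores any class-wide scaling or attendance deductions.

```python
def final_points(problem_set_scores, participation):
    """Combine scores into a 100-point total (illustrative sketch only).

    problem_set_scores: list of fractions in [0, 1], in order; the last
        entry is the final problem set and is weighted twice as heavily.
    participation: fraction in [0, 1].
    """
    if not problem_set_scores:
        raise ValueError("at least one problem set score is required")

    # Every problem set has weight 1 except the last, which has weight 2.
    weights = [1.0] * len(problem_set_scores)
    weights[-1] = 2.0

    weighted_sum = sum(w * s for w, s in zip(weights, problem_set_scores))
    problem_set_avg = weighted_sum / sum(weights)

    # Problem sets are 75% of the grade, participation 25%.
    return 75.0 * problem_set_avg + 25.0 * participation


if __name__ == "__main__":
    # Hypothetical student: three problem sets at 80%, a perfect final
    # problem set, and half credit for participation.
    print(final_points([0.8, 0.8, 0.8, 1.0], 0.5))
```

With these made-up scores the weighted problem set average is 0.88, so the total comes to about 78.5 points before any scaling.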
PROBLEM SETS:
Think of problem sets as miniature papers. Each should have a 1-2 sentence statement of the question (in your own words!), hypotheses, analysis, interpretation and broader conclusions. Answers to problem sets must be typed, combined with graphics into a single MS WORD file and turned in via email. You must summarize numeric output from SYSTAT (do not paste SYSTAT output tables directly). The use of your own summary tables is strongly encouraged where appropriate (e.g. for results from 5-7 similar t-tests). SYSTAT graphs can be pasted directly into MS PowerPoint for quick manipulation, labeling and drawing, and these figures can be pasted into MS WORD. See the Guide to Problem Sets for help.
ASK FOR HELP!: Please do not hesitate to ask for help if you need it on any aspect of problem sets or understanding the material. However, for help with analyses, you MUST at least try to solve your problem before asking for help! Keep in mind that my door is open for statistical advice down the road IF you have made an effort to do as much as you can on your own in this class.
ATTENDANCE is mandatory. You must make arrangements BEFOREHAND if you will miss all
or part of a class, or you will be docked 10 percentage points on your final
grade. Scientific reasons are acceptable for missing class (e.g. going to a
conference, data that must be collected during a specific time). Social events are not acceptable
excuses.
PLAGIARISM: Group participation is encouraged, but everybody should conduct the analysis at their own computer. Plagiarism will not be tolerated. This means that WRITTEN REPORTS MUST BE IN YOUR OWN WORDS, and the work of others must be properly cited. The penalty for plagiarism is a zero on that assignment.
COURSE OUTLINE (DA 765): (Subject to Revision)
Week 1 (3/29)
Philosophy of Data Analysis
Types of Data / Definitions
Descriptive statistics
Distributions & Variation
The normal distribution
The central limit theorem
Week 2 (4/5)
Probability and chance
Resampling and the bootstrap
Random variables
Introduction to hypothesis testing
Week 3 (4/12)
Hypothesis testing
Area under the curve, alpha, Type I and II errors
Comparing the means of two groups
t-distribution
Introduction to statistical power
Week 4 (4/19)
Data transformations I
Log, square root, arcsin
Graphical representation of data
Correlation
Assumptions and uses
Week 5 (4/26)
Regression
Linear, IV/DV assumptions
Relationship to ANOVA and the GLM
Multiple regression
Scaling relationships
Week 6 (5/3)
Categorical variables
Goodness of fit tests
Chi-square, K-S test
Contingency tables
Nonparametric statistics
strategies of use
Week 7 (5/10)
ANOVA
Introduction to general linear models
F-statistics
calculations and assumptions
Introduction to experimental design
Independence, confounding factors and pseudoreplication
Discussion: paper TBA (e.g. Hurlbert pseudoreplication)
Week 8 (5/17)
Higher-order ANOVA,
Repeated measures
fixed vs. random effects
Discussion: paper TBA (e.g. Petren Case Ecology paper)
Week 9 (5/24)
ANCOVA
Power analysis
Review of basic concepts and examples
Discussion: paper TBA (e.g. Jayne lizard tracks paper).
Week 10 (5/31)
Content will be designed based on class composition.
- Phylogenetic inference? Character evolution? Independent contrasts?
- A brief introduction to basic multivariate analyses (PCA?)
Discussion: paper TBA (e.g. Hamilton Zuk experiment paper).