General relativity/Contravariant and Covariant Indices

Revision as of 00:17, 8 April 2007 by 190.30.157.82 (talk) (→Contravariant and Covariant Vectors)

Rank and Dimension

Now that we have talked about tensors, we need to figure out how to classify them. One important characteristic is the rank of a tensor, which is the number of indices needed to specify the tensor. An ordinary matrix is a rank 2 tensor, a vector is a rank 1 tensor, and a scalar is rank 0. Tensors can, in general, have rank greater than 2, and often do.

Another characteristic of a tensor is its dimension, which is the number of values each index can take. For example, a matrix with 3 rows and 4 columns is a tensor of dimension (3,4); it has 3 × 4 = 12 components in total.
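As a rough illustration of rank and dimension, here is a minimal Python sketch that stores a tensor as nested lists. The helper names `shape`, `rank` and `n_components` are hypothetical, chosen only for this example, and the sketch assumes every nested list is non-empty and rectangular.

```python
def shape(t):
    """Return the size of each index of a nested-list tensor, outermost first.

    Assumes the nested lists are non-empty and rectangular.
    """
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)


def rank(t):
    """The rank is the number of indices needed to pick out one component."""
    return len(shape(t))


def n_components(t):
    """Total number of components: the product of the sizes of all indices."""
    total = 1
    for size in shape(t):
        total *= size
    return total


scalar = 5.0                      # rank 0
vector = [1.0, 2.0, 3.0]          # rank 1
matrix = [[1, 2, 3, 4],           # rank 2: 3 rows, 4 columns
          [5, 6, 7, 8],
          [9, 10, 11, 12]]

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    print(name, "rank:", rank(t), "dimension:", shape(t), "components:", n_components(t))
```

For the 3-by-4 matrix this reports rank 2, dimension (3, 4) and 12 components.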

The important thing about rank and dimension is that they are invariant under changes in the coordinate system. You can change the coordinate system all you want, and the rank and the dimension don't change. This brings up the important question of how tensors do change when you change the coordinate system. One thing we shall find when we look at the question is that in reality there are two different types of vectors.

Contravariant and Covariant Vectors

Imagine that you are driving a car at 100 kilometers per hour toward the northwest. Let's call this vector v. Suddenly you realize that you are in a meter-ish mood, and so you want to figure out how fast you are going using meters instead of kilometers. Quickly changing your coordinate system, you find that you are travelling 100 × 1000 = 100,000 meters per hour toward the northwest. No problem.

Now you are in the rolling countryside, and you notice the temperature changing. You draw a map of how the temperature changes as you move across the countryside, and then travel along the path of steepest descent. At a given point you notice that the temperature is changing at 10 Celsius degrees per kilometer toward the southwest. Let's call this vector w. Again you go into a meter-ish mood. Doing a quick calculation, you figure out that the gradient of the temperature change is downward at 10/1000 = 0.01 Celsius degrees per meter.

Did you notice something interesting?

Even though we are talking about two vectors, we are treating them very differently when we change our coordinates. In the first case, the vector reacted to the coordinate change by a multiplication. In the second case, we did a division. In the first case we were changing a vector that was distance per something, while in the second case the vector was something per distance. These are two very different types of vectors.

The mathematical term for the first type of vector is a contravariant vector. The second type of vector is called a covariant vector. Sometimes a covariant vector is called a one-form.
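The two unit changes described above can be sketched in a few lines of Python. The component values are hypothetical: they were picked to give a speed of 100 km/h and a gradient of 10 Celsius degrees per km with round numbers, so the directions are only roughly northwest and southwest. The (east, north) axis convention and the function names are also assumptions of this example.

```python
KM_TO_M = 1000.0  # meters per kilometer


def contravariant_km_to_m(components):
    """Distance is in the numerator, so each component is multiplied."""
    return [c * KM_TO_M for c in components]


def covariant_km_to_m(components):
    """Distance is in the denominator, so each component is divided."""
    return [c / KM_TO_M for c in components]


# Hypothetical (east, north) components.
v_km = [-60.0, 80.0]   # velocity in km per hour, magnitude 100
w_km = [-6.0, -8.0]    # temperature gradient in Celsius degrees per km, magnitude 10

v_m = contravariant_km_to_m(v_km)  # meters per hour
w_m = covariant_km_to_m(w_km)      # Celsius degrees per meter

print("v in m/h:", v_m)
print("w in degrees C per m:", w_m)
```

The same change of units makes the components of v a thousand times larger and the components of w a thousand times smaller.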

Attempting a fuller explanation
It is easy to see why w is called covariant. Covariant simply means that the characteristic that w measures, change in temperature, increases in magnitude with an increase in displacement along the coordinate system. In other words, the further you travel from a fixed point, the more the temperature changes, or equivalently, change in temperature covaries with change in displacement.
Although it is a bit more difficult to see, v is called contravariant for precisely the opposite reason. Since v represents a velocity, or distance per unit time, we can think of v as the inverse of time per unit distance, meaning the amount of time that passes in traveling a certain fixed amount of distance. Time per unit distance is clearly covariant, because as you travel further and further from a fixed point, more and more time elapses. In other words, time covaries with displacement. Since velocity is the inverse of time per unit distance, it follows that velocity must be contravariant.
The difference is also evident in the units of measure. The units of measure for v are meters per hour, whereas the units for w are degrees Celsius per meter. The coordinate system is position in space, measured in units of meters. So again, we see that the coordinate system appears in the numerator of v, which suggests that v is contravariant (with inverse time in this case), whereas the coordinate system appears in the denominator of w, which indicates that w is covariant (with change in temperature).
Contravariant vectors describe those quantities where the distance unit appears in the numerator (like velocity), whereas covariant vectors describe those where the distance unit appears in the denominator (like a temperature gradient).


These are, of course, just fancy mathematical names. As we can see, contravariant vectors and covariant vectors are very different from each other, and we want to avoid confusing them with each other. To do this mathematicians have come up with a clever notation. The components of a contravariant vector are represented by superscripts, while the components of a covariant vector are represented by subscripts. So the components of vector v are $v^1$ and $v^2$, while the components of vector w are $w_1$ and $w_2$.

Scale Invariance

Now that we have contravariant vectors and covariant vectors, we can do something very interesting and combine them. We have a contravariant vector that describes the direction and speed at which we are going. We have a covariant vector that describes the rate and direction at which the temperature changes. If we combine them using the dot product

$f = \mathbf{v} \cdot \mathbf{w}$

we get the rate at which the temperature changes, f, as we move in a certain direction, with units of degrees Celsius per hour (distance per hour times degrees Celsius per distance). The interesting thing about the units of f is that they do not include any units of distance, such as meters or kilometers. So now suppose we change the coordinate system from meters to kilometers. How does f change? It doesn't. We call this characteristic scale invariance, and we say that f is a scale invariant quantity. The value of f is invariant under changes in the scale of the coordinate system.
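A small Python sketch can check this scale invariance numerically. It uses exact fractions so the comparison is not blurred by rounding. The component values and the function names are hypothetical, chosen for this illustration only.

```python
from fractions import Fraction


def rescale_contravariant(components, factor):
    """Shrinking the length unit by `factor` multiplies these components."""
    return [c * factor for c in components]


def rescale_covariant(components, factor):
    """The same unit change divides these components."""
    return [c / factor for c in components]


def pair(v, w):
    """Combine a contravariant and a covariant vector: sum of v^i * w_i."""
    return sum(vi * wi for vi, wi in zip(v, w))


# Hypothetical components in kilometer-based units.
v_km = [Fraction(-60), Fraction(80)]   # km per hour
w_km = [Fraction(-6), Fraction(-8)]    # Celsius degrees per km

f_km = pair(v_km, w_km)                # Celsius degrees per hour

# Switch to meters (factor 1000) and to centimeters (factor 100000).
for factor in (1000, 100000):
    v_new = rescale_contravariant(v_km, factor)
    w_new = rescale_covariant(w_km, factor)
    assert pair(v_new, w_new) == f_km  # f does not depend on the length unit

print(f_km)  # -280
```

The factor introduced into v is cancelled exactly by the factor removed from w, which is why no unit of distance survives in f.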

So far we have been treating w as if it were just an odd type of vector. But there is another, more powerful way of thinking about w. Look at what we just did. We took v, combined it with w, and got something that doesn't change when you change the coordinate system. One way of thinking about it is to say that w is a function that takes v and converts it into a scale invariant value, f.


Vector Spaces and Basis Vectors

This fact, that a covariant vector like w can convert any contravariant vector like v into a scale invariant value like f, is described by saying that w is a linear functional.

Can we be more precise about using the word like? Yes, and it's important. Mathematical operations such as converting one sort of vector into another sort of vector are done on vector spaces. See vector space for a careful definition of vector spaces. Here, loosely speaking, let us say that a vector space is a set of vectors which can be added together and multiplied by numbers and that the result is always another vector in the same vector space.

Let us define V to be the vector space of contravariant vectors like v.

Then the set of all covariant vectors like w, which convert vectors like v from V into scalars like f, is the set of all linear functionals on V. We give this set the name V*, and we call it the dual space.

V* is also a vector space.

Now we can be more careful about the word like, by saying which spaces each object must be a member of: any vector w in V* (called a covariant vector, or a 1-form) can convert any vector v in V (called a contravariant vector) into a scale invariant value like f. (We have not said what set f is a member of: in practice, we will usually only be interested in f as a member of the set of real numbers.)

Any vector space has a set of basis vectors. Components of contravariant vectors are written with superscript ("upper") indices, but the basis vectors are written with subscript ("lower") indices. If the set $\{\mathbf{e}_\alpha\}$ is a basis for $V$, then $\mathbf{v} \in V$ is written as the linear combination $\mathbf{v} = v^\mu \mathbf{e}_\mu$. (We are using Einstein summation notation, detailed in the next section; this is shorthand for $\sum_\mu v^\mu \mathbf{e}_\mu$.)

Before moving on to covariant vectors, we must define the notion of a dual basis. Remember that elements of $V^*$ are linear functionals on $V$. So we can "apply" covariant vectors to contravariant vectors to get a scalar. For example, if $\sigma \in V^*$ and $\mathbf{v} \in V$, then $\sigma(\mathbf{v})$ returns a scalar. Now, the dual basis is defined as follows: if $\{\mathbf{e}_\alpha\}$ is a basis for $V$, then the dual basis is a basis $\{\omega^\alpha\}$ for $V^*$ which satisfies $\omega^\mu(\mathbf{e}_\nu) = \delta^\mu_\nu$ (where $\delta^\mu_\nu$ is the Kronecker delta) for every $\mu$ and $\nu$. Note that the dual basis for the canonical basis is usually written as $\{\mathbf{d}x^\alpha\}$, for reasons we will not go into in this section.
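To make the defining condition concrete, here is a minimal Python sketch for a two-dimensional space. It assumes vectors and covectors are both stored as pairs of numbers in the canonical coordinates, so applying a covector is a sum of products. Under that assumption the dual basis covectors are the rows of the inverse of the matrix whose columns are the basis vectors. The function names and the example basis are hypothetical.

```python
from fractions import Fraction


def dual_basis_2d(e1, e2):
    """Return covectors (omega1, omega2) with omega^mu(e_nu) = delta^mu_nu.

    e1 = (a, c) and e2 = (b, d) are the columns of E = [[a, b], [c, d]];
    the dual basis covectors are the rows of the inverse of E.
    """
    (a, c), (b, d) = e1, e2
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("e1 and e2 are linearly dependent, so they are not a basis")
    omega1 = (d / det, -b / det)
    omega2 = (-c / det, a / det)
    return omega1, omega2


def apply(covector, vector):
    """Apply a covector to a vector, both given in canonical coordinates."""
    return sum(s * x for s, x in zip(covector, vector))


# A hypothetical basis that is neither orthogonal nor normalized.
e1, e2 = (2, 0), (1, 1)
omega1, omega2 = dual_basis_2d(e1, e2)

# Check the Kronecker delta condition for every pair of indices.
for mu, omega in enumerate((omega1, omega2)):
    for nu, e in enumerate((e1, e2)):
        assert apply(omega, e) == (1 if mu == nu else 0)

print(omega1, omega2)
```

Note that the dual basis depends on the whole basis: changing e2 alone changes omega1 as well.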

Now, the components of covariant vectors are written with subscript ("lower") indices. As $\{\omega^\alpha\}$ is a basis for $V^*$, we can write a covariant vector $\sigma$ as $\sigma = \sigma_\mu \omega^\mu$.

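We can now evaluate any functional (covariant vector) applied to any vector (contravariant vector). If $\sigma \in V^*$ and $\mathbf{v} \in V$, then by linearity $\sigma(\mathbf{v}) = \sigma_\alpha \omega^\alpha(v^\beta \mathbf{e}_\beta) = \sigma_\alpha v^\beta \omega^\alpha(\mathbf{e}_\beta) = \sigma_\alpha v^\beta \delta^\alpha_\beta = \sigma_\alpha v^\alpha$. Finally, if we define $\mathbf{e}_\alpha(\omega^\beta) = \delta^\beta_\alpha$, we see that $\mathbf{v}(\omega) = \omega(\mathbf{v})$.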
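The result $\sigma(\mathbf{v}) = \sigma_\alpha v^\alpha$ can be checked on a small example. The Python sketch below uses a hypothetical two-dimensional basis together with its dual basis, both written in canonical coordinates, and compares the functional applied directly with the sum over components. All names and numbers are assumptions made for illustration.

```python
def combine(coeffs, elements):
    """Linear combination sum_k coeffs[k] * elements[k] of 2-component tuples."""
    return tuple(sum(c * el[i] for c, el in zip(coeffs, elements)) for i in range(2))


def apply(covector, vector):
    """Apply a covector to a vector, both given in canonical coordinates."""
    return sum(s * x for s, x in zip(covector, vector))


# Hypothetical basis e_alpha and its dual basis omega^alpha.
e = [(1, 0), (1, 1)]
omega = [(1, -1), (0, 1)]

# Confirm omega^mu(e_nu) = delta^mu_nu before relying on it.
for mu in range(2):
    for nu in range(2):
        assert apply(omega[mu], e[nu]) == (1 if mu == nu else 0)

v_up = [3, 2]          # components v^beta
sigma_down = [4, -1]   # components sigma_alpha

v = combine(v_up, e)                 # v = v^beta e_beta
sigma = combine(sigma_down, omega)   # sigma = sigma_alpha omega^alpha

direct = apply(sigma, v)                                        # sigma(v)
by_components = sum(s * x for s, x in zip(sigma_down, v_up))    # sigma_alpha v^alpha

print(direct, by_components)  # 10 10
```

The two numbers agree even though the basis is not orthonormal, because the components of σ are taken with respect to the dual basis.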