HELP! A Stubborn Vector Identity to Understand

Over the past three years or so, I have been researching the history and implementation of Gibbsian vector analysis with the intent of finding ways to incorporate it more thoroughly and more meaningfully into introductory calculus-based physics (possibly algebra/trig-based physics too). Understanding the usual list of vector identities has been part of this research. One vector identity that has frustrated me involves probably the most innocent looking quantity, the gradient of the dot product of two vectors. I have seen no fewer than five different expressions for the expansion of this seemingly harmless quantity. Here they are.

(1)   \begin{equation*} \nabla\left(\mathbf{A}\bullet\mathbf{B}\right) = \nabla_A\left(\mathbf{A}\bullet\mathbf{B}\right)+\nabla_B\left(\mathbf{A}\bullet\mathbf{B}\right) \end{equation*}

(2)   \begin{equation*} \nabla\left(\mathbf{A}\bullet\mathbf{B}\right) = \left(\nabla\mathbf{A}\right)\bullet\mathbf{B}+\left(\nabla\mathbf{B}\right)\bullet\mathbf{A} \end{equation*}

(3)   \begin{equation*} \nabla\left(\mathbf{A}\bullet\mathbf{B}\right) = J^T_A\mathbf{B}+J^T_B\mathbf{A} \end{equation*}

(4)   \begin{equation*} \nabla\left(\mathbf{A}\bullet\mathbf{B}\right) = \left(\mathbf{B}\bullet\nabla\right)\mathbf{A}+\mathbf{B}\times\left(\nabla\times\mathbf{A}\right)+\left(\mathbf{A}\bullet\nabla\right)\mathbf{B}+\mathbf{A}\times\left(\nabla\times\mathbf{B}\right) \end{equation*}

(5)   \begin{equation*} \nabla\left(\mathbf{A}\bullet\mathbf{B}\right) = \mathbf{B}\left(\nabla\bullet\mathbf{A}\right)+\left(\mathbf{B}\times\nabla\right)\times\mathbf{A}+\mathbf{A}\left(\nabla\bullet\mathbf{B}\right)+\left(\mathbf{A}\times\nabla\right)\times\mathbf{B} \end{equation*}

 

Now, equation (1) uses Feynman notation, which endows the nabla operator with the property of obeying the Leibniz rule, or the product rule, for derivatives.  The subscript refers to the vector on which the nabla operator is operating while the other vector is treated as a constant. Note that in chapter 3 of Wilson’s text based on Gibbs’ lecture notes, the subscript denotes which vector is to be held constant, precisely the opposite of the way Feynman presents it. Equation (1) is merely an alternative way of writing the lefthand side and offers nothing new algebraically.

Equation (2) shows nabla operating on each vector in the dot product, which is something many students never see. Like me years ago, they are told that one can only take the gradient of a scalar and not a vector, which is patently false. The twist is that, unlike the gradient of a scalar, the gradient of a vector is not a vector; it is a second-rank tensor, which can be represented by a matrix. This tensor, and its matrix representation, is also called the Jacobian. The dot product of this tensor with a vector gives a vector, so equation (2) is consistent with the fact that the lefthand side must be a vector. I can derive this expression using index notation.

Equation (3) is equation (2) written in (a very slight variation of) matrix notation (the vectors are written as vectors and not as column matrices). I don’t think there is anything more to it.
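Equations (2) and (3) are easy to check symbolically. Here is a quick sketch with SymPy; the fields A and B are arbitrary choices of mine for illustration, not anything from a particular text.

```python
# Symbolic check of equation (3): grad(A·B) = J_A^T B + J_B^T A,
# where J_F is the Jacobian matrix J_F[i][j] = dF_i/dx_j.
# The fields A and B are arbitrary illustrative choices.
import sympy as sp

x, y, z = sp.symbols('x y z')
A = sp.Matrix([x*y, y*z**2, sp.sin(x)*z])
B = sp.Matrix([y**2, x + z, x*y*z])

grad = lambda f: sp.Matrix([sp.diff(f, v) for v in (x, y, z)])
J = lambda F: F.jacobian([x, y, z])

lhs = grad((A.T * B)[0])        # gradient of the scalar A·B
rhs = J(A).T * B + J(B).T * A   # equation (3)
print(sp.simplify(lhs - rhs))   # zero vector
```

The transpose on the Jacobians matters: it is what makes "(∇A)·B" in equation (2) contract the derivative index with B rather than the component index.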

Equation (4) is the traditional expansion of the lefthand side. It is derived from the BAC-CAB rule, with suitable rearrangements to make sure nabla operates on one vector in each term. Two such applications give equation (4). The “reverse divergence” operators are actually directional derivatives operating on the vectors immediately to the right of each operator. I can derive this expression using index notation.

Equation (5) is shown in problem 1.8.12 on page 48 of Arfken (6th edition). It has the advantage of using the divergences of the two vectors, which I think are easier to understand than the “reverse divergence” operators in equation (4). However, the “reverse curl” operators are completely new to me and I have never seen them in the literature anywhere other than in this problem in Arfken. I think this equation can be derived from equation (4) by appropriately manipulating the various dot and cross products. I have not yet attempted to derive this expression with index notation.
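Equations (4) and (5) can also be verified directly. In the sketch below (same arbitrary test fields as before), the "reverse curl" (B × ∇) × A is computed componentwise from its index form, (B^k A^k_{,l} − B^l A^k_{,k}) e_l, which is my reading of the operator; treat that as an assumption to be checked against Arfken.

```python
# Check that the righthand sides of (4) and (5) both equal grad(A·B).
# A and B are arbitrary illustrative fields.
import sympy as sp

x, y, z = sp.symbols('x y z')
V = (x, y, z)
A = sp.Matrix([x*y, y*z**2, sp.sin(x)*z])
B = sp.Matrix([y**2, x + z, x*y*z])

grad = lambda f: sp.Matrix([sp.diff(f, v) for v in V])
div  = lambda F: sum(sp.diff(F[i], V[i]) for i in range(3))
curl = lambda F: sp.Matrix([sp.diff(F[2], y) - sp.diff(F[1], z),
                            sp.diff(F[0], z) - sp.diff(F[2], x),
                            sp.diff(F[1], x) - sp.diff(F[0], y)])
# (F·∇)G, the directional derivative of G along F
ddot = lambda F, G: sp.Matrix([sum(F[j]*sp.diff(G[i], V[j]) for j in range(3))
                               for i in range(3)])
# (F×∇)×G, assumed index form (F^k G^k_{,l} - F^l G^k_{,k}) e_l
rcurl = lambda F, G: sp.Matrix([sum(F[k]*sp.diff(G[k], V[l]) for k in range(3))
                                - F[l]*div(G) for l in range(3)])

lhs  = grad((A.T * B)[0])
rhs4 = ddot(B, A) + B.cross(curl(A)) + ddot(A, B) + A.cross(curl(B))
rhs5 = B*div(A) + rcurl(B, A) + A*div(B) + rcurl(A, B)
print(sp.simplify(lhs - rhs4), sp.simplify(lhs - rhs5))  # both zero
```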

Now, many questions come to mind. I have arranged the first and second terms on the righthand sides of equations (4) and (5) to correspond to the first term on the righthand sides of equations (1), (2), and (3). Similarly, the third and fourth terms on the righthand sides of (4) and (5) correspond to the second term on the righthand sides of equations (1), (2), and (3). By comparison, this must mean that somehow from the gradient (Jacobian) of a vector come both a dot product and a triple cross product. How can this be?

How can the gradient (Jacobian) of a vector be decomposed into a dot product and a triple cross product?

I think I can partly see where the dot product comes from, and it’s basically the notion of a directional derivative. The triple cross products are a complete mystery to me. Is there a geometrical reason for their presence? Would expressing all this in the language of differential forms help? Equations (4) and (5) also seem to imply that the triple cross products are associative, which they generally are not. I think I can justify the steps to get from (4) to (5), so if anyone can help me understand geometrically how the Jacobian can be decomposed into a dot product (directional derivative) and the cross product of a vector and the curl of the other vector, I’d be very grateful.

 

6 thoughts on “HELP! A Stubborn Vector Identity to Understand”

  1. Hi, it’s David Metzler. I have thought a bit about the identities, and I’ll look at it some more. One (not very geometric) way to look at it is that you are comparing different tensor contractions of the three-tensor \nabla A \otimes B. The difference of two such tensor contractions induces an antisymmetrization of two indices, which naturally yields something you can write in terms of a cross product. Depending on which two of the three contractions you focus on, you get (4) or (5). This is essentially just a fancy way to describe an index calculation, but it does suggest that these calculations are relatively natural. I do think the fact that you’re working with the whole three-tensor is key to getting the different expressions (4) and (5), so they’re not just coming from the Jacobian (or covariant differential, in more differential-geometric language). I’ll try to write it up nicely.

    I thought a bit about how it would look using forms, but it’s not all antisymmetric, and it necessarily involves the metric tensor, so it’s not clear if there’s a nice presentation of it in that language.

    David M.

      1. I’m still thinking about deeper ways to understand the last two identities, but here are some index calculations. I’m going to be lazy and not put any vector arrows on anything. The basis vectors are {e_i}, and the summation convention is used throughout, even when indices are both down or both up (since the Euclidean metric [dot product] is assumed).

        We have \nabla \times A = \epsilon_{nij} A^j_{,i} e_n and B \times (\nabla \times A) = \epsilon_{lkn} B^k \epsilon_{nij} A^j_{,i} e_l = (A^k_{,l} - A^l_{,k}) B^k e_l, which shows that this contracts B with an antisymmetrization of the Jacobian of A. To see how the epsilons reduce, note that there are two cases, one where i = l, j = k (which gives a plus sign) and one where i = k, j = l (which gives a minus sign). (This is just like the index proof of the vector triple product identity.)

        The novel “reverse curl” operator is defined by B \times \nabla = \epsilon_{ijk} B^j \partial_k e_i, yielding (B \times \nabla) \times A = \epsilon_{ijk} B^j \partial_k \epsilon_{lin} A^n e_l = (B^k A^k_{,l} - B^l A^k_{,k}) e_l. This is not just B contracted with an antisymmetrization of the Jacobian of A. Rather, it’s a contraction of an antisymmetrization of the three-tensor B \otimes \nabla A, where you antisymmetrize the index coming from B with the derivative index.

        Now note that \nabla_{A}(A \cdot B) = A^k_{,l} B^k e_l, (B \cdot \nabla)A = B^k A^l_{,k} e_l, and B(\nabla \cdot A) = B^l A^k_{,k} e_l, which are the three possible contractions of B \otimes \nabla A. Hence the two antisymmetric gadgets above are the differences of the first and second, or first and third, respectively, of these quantities. That proves the fourth and fifth identities. However it’s a very dry proof, for my taste.

        Do you have sources that put these identities to use? I’d be curious to see them in action for some real purpose. It seems that if A,B are either both divergence-free or curl-free (or both!) then they give at least mildly interesting results. It’s not clear to me what the geometric meaning of the vanishing of (B \times \nabla) \times A would be, though.
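The epsilon reduction used in the comment above can be checked numerically with an explicit Levi-Civita tensor. This is a quick sketch of mine, not part of the original comment:

```python
# Verify the contraction eps_{lkn} eps_{nij} = d_{li} d_{kj} - d_{lj} d_{ki}
# by building the Levi-Civita tensor explicitly and comparing both sides.
import numpy as np

eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1    # even permutations
    eps[i, k, j] = -1   # odd permutations

lhs = np.einsum('lkn,nij->lkij', eps, eps)  # sum over the shared index n
d = np.eye(3)
rhs = np.einsum('li,kj->lkij', d, d) - np.einsum('lj,ki->lkij', d, d)
print(np.array_equal(lhs, rhs))  # True
```

The two surviving delta terms are exactly the "i = l, j = k gives a plus sign, i = k, j = l gives a minus sign" cases described above.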


    1. David, believe it or not, this vector identity implicitly appears in introductory calculus-based physics in two places. These examples are usually omitted in the intro courses because they appear again in more detail in upper level electromagnetism courses.

      The potential energy of a system consisting of an electric dipole and an external electric field is given by U= -\mathbf{p}\cdot\mathbf{E} where \mathbf{p} is the dipole moment and \mathbf{E} is the electric field. The force can now be calculated as the gradient of this dot product. Various terms can be made to vanish by making assumptions about the dipole moment and electric field. If the dipole moment is constant, it pushes through the derivatives as a constant. If the electric field is static, it has no curl.
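With a constant p and a curl-free (electrostatic) E, only the directional-derivative term of identity (4) survives, so F = ∇(p·E) = (p·∇)E. A quick sketch, with a curl-free field built from an arbitrary potential of my choosing:

```python
# With p constant and E curl-free, grad(p·E) reduces to (p·grad)E.
# The potential phi (and hence E = -grad(phi)) is an arbitrary example.
import sympy as sp

x, y, z = sp.symbols('x y z')
V = (x, y, z)
p = sp.Matrix(sp.symbols('p1 p2 p3'))            # constant dipole moment
phi = x**2 * y + sp.exp(z) * x                   # arbitrary potential
E = -sp.Matrix([sp.diff(phi, v) for v in V])     # curl-free by construction

grad = lambda f: sp.Matrix([sp.diff(f, v) for v in V])
force = grad((p.T * E)[0])                       # F = grad(p·E)
pgradE = sp.Matrix([sum(p[j]*sp.diff(E[i], V[j]) for j in range(3))
                    for i in range(3)])          # (p·grad)E
print(sp.simplify(force - pgradE))               # zero vector
```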

      The potential energy of a system consisting of a magnetic dipole in an external magnetic field is given by an analogous expression U = -\mathbf{m}\cdot\mathbf{B} with \mathbf{m} being the magnetic dipole moment and \mathbf{B} being the magnetic field. Again, certain assumptions about the dipole moment and magnetic field (magnetic fields have no divergence, for example) make certain terms vanish.

      In playing around yesterday, I think I was able to show using index notation that the triple cross products (I call them double cross products, since only two cross operations are involved) are indeed associative if they contain \nabla, but I’m not convinced I’m correct. Please feel free to double check me on this, as I may certainly be wrong: I’ve never seen the “reverse curl” operator in the literature, and I’m rather stunned that triple (double) cross products with one \nabla may be associative. (I know that if the two outer vectors in a triple (double) cross product are equal, e.g. \mathbf{n}\times(\mathbf{A}\times\mathbf{n}), then the result is indeed associative.)

      More later.
