In this section we present the governing equations of elasticity in small and large deformation. In small strain (linear elasticity), the boundary-value problem (Strong form) for constitutive equation (42) may be stated as follows: Given body force \(\rho \bm g\), Dirichlet boundary \(\bar{\bm u}\) and applied traction \(\bar{\bm t}\), find the displacement variable \(\bm u \in \mathcal{V}\) (here \(\mathcal{V} = H^1(\Omega)\) ), such that:

(92)\[ \begin{aligned} -\nabla \cdot \bm \sigma - \rho \bm g &= 0, \qquad \text{in $\Omega$} \\ \bm{u} &= \bar{\bm u}, \quad \text{on $\partial \Omega^{D}$} \\ \bm \sigma \cdot \bm n &= \bar{\bm t}. \qquad \text{on $\partial \Omega^{N}$} \end{aligned} \]

with \(\bm n\) be the unit normal on the boundary and its weak formulation as:

(93)\[ \int_{\Omega}{ \nabla \bm{v} \tcolon \bm{\sigma}} \, dv - \int_{\partial \Omega}{\bm{v} \cdot \bar{\bm t}} \, da - \int_{\Omega}{\bm{v} \cdot \rho \bm{g}} \, dv = 0, \quad \forall \bm v \in \mathcal V. \]

Hyperelasticity, initial configuration

In the total Lagrangian approach for the Neo-Hookean hyperelasticity problem, the discrete equations are formulated with respect to the initial configuration. In this formulation, we solve for displacement \(\bm{u} \left( \bm{X} \right)\) in the initial frame \(\bm{X}\). The notation for elasticity at finite strain is inspired by [Hol00] to distinguish between the current and initial configurations. As explained in the Continuum Mechanics section, we denote by capital letters the initial frame and by small letters the current one.

We can state the hyperelastic strong form as follow: Given body force \(\rho_0 \bm g\), Dirichlet boundary \(\bar{\bm u}\) and applied traction \(\bar{\bm t}\), find the displacement variable \(\bm u \in \mathcal{V}\) such that:

(94)\[ \begin{aligned} -\nabla_X \cdot \bm P - \rho_0 \bm g &= 0. \qquad \text{in $\Omega_0$} \\ \bm{u} &= \bar{\bm u}, \quad \text{on $\partial \Omega^{D}_0$} \\ \bm P \cdot \bm N &= \bar{\bm t}. \qquad \text{on $\partial \Omega^{N}_0$} \end{aligned} \]

and its weak formulation as:

(95)\[ R(\bm u) \coloneqq \int_{\Omega_0} \nabla_X \bm v \tcolon \bm P \, dV - \int_{\Omega_0}\bm v \cdot \rho_0 \bm g \,dV - \int_{\partial \Omega^{N}_0} \bm v \cdot \bar{\bm t} \, dA = 0, \qquad \forall \in \bm{v} \in \mathcal{V} \]

where \((.)_0\) indicates the initial configuration, \(\bm P = \bm F \bm S\) is the first Piola-Kirchhoff stress tensor (see (19) and (21)), \(\bm N\) is the unit normal vector, and \(_X\) in \(\nabla_X\) indicates that the gradient is calculated with respect to the initial configuration. This equation contains material/constitutive nonlinearities in defining \(\bm{S}(\bm{E})\), as well as geometric nonlinearities through \(\bm{P} = \bm{F}\, \bm{S}\), \(\bm{E}(\bm{F})\), and the body force \(\bm{g}\), which must be pulled back from the current configuration to the initial configuration.


Linear elasticity and small-strain hyperelasticity can both by obtained from the finite-strain hyperelastic formulation by linearization of geometric and constitutive nonlinearities. The effect of these linearizations is sketched in the diagram below, where \(\bm{\sigma}\) and \(\bm{\varepsilon}\) are stress and strain, respectively, in the small strain regime, while \(\bm{S}\) and \(\bm{E}\) are their finite-strain generalizations (second Piola-Kirchoff tensor and Green-Lagrange strain tensor, respectively) defined in the initial configuration, and \(\mathsf{C}\) is a linearized constitutive model.

(96)\[ \begin{CD} {\overbrace{\bm{S} \left( \bm{E} \right)}^{\text{Finite Strain Hyperelastic}}} @>{\text{constitutive}}>{\text{linearization}}> {\overbrace{\bm{S} = \mathsf{C} \bm{E}}^{\text{St. Venant-Kirchoff}}} \\ @V{\text{geometric}}V{\begin{smallmatrix}\bm{E} \to \bm{\varepsilon} \\ \bm{S} \to \bm{\sigma} \end{smallmatrix}} V @V{\begin{smallmatrix}\bm{E} \to \bm{\varepsilon} \\ \bm{S} \to \bm{\sigma} \end{smallmatrix}}V{\text{geometric}} V \\ {\underbrace{\bm{\sigma} \left( \bm{\varepsilon} \right)}_\text{Small Strain Hyperelastic}} @>{\text{constitutive}}>\text{linearization}> {\underbrace{\bm{\sigma} = \mathsf{C} \bm{\varepsilon}}_\text{Linear Elastic}} \end{CD} \]

Newton linearization

Discretization of (95) produces a finite-dimensional system of nonlinear algebraic equations, which we solve using Newton-Raphson methods. One attractive feature of Galerkin discretization is that we can arrive at the same linear system by discretizing the Newton linearization of the continuous form; that is, discretization and differentiation (Newton linearization) commute. In general for solving a nonlinear equation \(R(\bm u) = 0\), we use the first-order (Taylor’s) expansion

\[ 0 = R(\bm u) + \diff R(\bm u; \diff \bm u) + o(\diff \bm u), \]

where \(\diff\) and \(\diff \bm u\) denote the linearization operator and the increment of the displacement field \(\bm u\). The reminder \(o(\diff \bm u)\) is a small error that tends to zero faster than \(\diff \bm u \to 0\). The second term \(\diff R\) is the linear change in \(R\) due to \(\diff \bm u\) at \(\bm u\). In fact, it is the directional derivative of \(R\) at a given \(\bm u\) (fixed) in the direction of the incremental displacement field.

The linearized form of (95) (assume constant traction and body force) may be stated as: Find \(\diff \bm u \in \mathcal{V}\) such that

(97)\[ \int_{\Omega_0} \nabla_X \bm v \tcolon \diff \bm P \, dV = -R(\bm u), \qquad \forall \in \bm{v} \in \mathcal{V} \]


(98)\[ \diff \bm{P} = \frac{\partial \bm{P}}{\partial \bm{F}} \tcolon \diff \bm{F} = \diff \bm{F}\, \bm{S} + \bm{F} \underbrace{\frac{\partial \bm{S}}{\partial \bm{E}} \tcolon \diff \bm{E}}_{\diff \bm{S}} \]

and the linearization of deformation gradient and Green-Lagrange strain tensors are

(99)\[ \diff \bm{E} = \frac{\partial \bm{E}}{\partial \bm{F}} \tcolon \diff \bm{F} = \frac{1}{2} \left(\diff \bm{F}^T \bm{F} + \bm{F}^T \diff \bm{F} \right), \qquad \diff \bm{F} = \nabla_X\diff \bm{u}. \]

The linearization of the second Piola-Kirchhoff stress tensor, \(\diff \bm{S}\), depends upon the material model and derived for different materials in the following.

Deriving \(\diff\bm{S}\) for Neo-Hookean material

For the Neo-Hookean model (51),

(100)\[ \diff \bm{S} = \frac{\partial \bm{S}}{\partial \bm{E}} \tcolon \diff \bm{E} = \left( \lambda J^2 V^{''} + \lambda J \, V'\right) (\bm{C}^{-1} \tcolon \diff \bm{E}) \bm{C}^{-1} + \left( \lambda J \, V' - \mu \right) \diff \bm{C}^{-1}, \]


\[ \diff \bm{C}^{-1} = \frac{\partial \bm{C}^{-1}}{\partial \bm{E}} \tcolon \diff \bm{E} = -2 \bm{C}^{-1} \diff \bm{E} \, \bm{C}^{-1}, \]

and for \(V(J) = \frac{1}{4} \left( J^2 - 1 - 2 \log J \right)\) given in strain energy (50), the derivatives are

\[ V'(J) = \frac{1}{2J} \left( J^2 - 1 \right), \, \, V^{''}(J) = \frac{1}{2 J^2} \left( J^2 + 1 \right). \]

The quantity \({\partial \bm{S}} / {\partial \bm{E}}\) is known as the incremental elasticity tensor, and is analogous to the linear elasticity tensor \(\mathsf C\).

\(\diff \bm{S}\) in index notation

It is sometimes useful to express (100) in index notation,

(101)\[ \begin{aligned} \diff \bm{S}_{IJ} &= \frac{\partial \bm{S}_{IJ}}{\partial \bm{E}_{KL}} \diff \bm{E}_{KL} \\ &= \lambda J^2 \left( \bm{C}^{-1}_{KL} \diff \bm{E}_{KL} \right) \bm{C}^{-1}_{IJ} + 2 \left( \mu - \frac{\lambda}{2} \left(J^2 - 1 \right) \right) \bm{C}^{-1}_{IK} \diff \bm{E}_{KL} \bm{C}^{-1}_{LJ} \\ &= \underbrace{\left( \lambda J^2 \bm{C}^{-1}_{IJ} \bm{C}^{-1}_{KL} + 2 \left( \mu - \frac{\lambda}{2} \left(J^2 - 1 \right) \right) \bm{C}^{-1}_{IK} \bm{C}^{-1}_{JL} \right)}_{\mathsf C_{IJKL}} \diff \bm{E}_{KL} , \end{aligned} \]

where we have identified the effective elasticity tensor \(\mathsf C = \mathsf C_{IJKL}\). It is generally not desirable to store \(\mathsf C\), but rather to use the earlier expressions so that only \(3 \times 3\) tensors (most of which are symmetric) must be manipulated. That is, given the linearization point \(\bm F\) and solution increment \(\diff \bm F = \nabla_X (\diff \bm u)\) (which we are solving for in the Newton step), we compute \(\diff \bm P\) via

  1. recover \(\bm C^{-1}\) and \(J-1\) (either stored at quadrature points or recomputed),

  2. proceed with \(3\times 3\) matrix products as in (100) or the second line of (101) to compute \(\diff \bm S\) while avoiding computation or storage of higher order tensors, and

  3. conclude by (98), where \(\bm S\) is either stored or recomputed from its definition exactly as in the nonlinear residual evaluation.

Deriving \(\diff\bm{S}\) for Mooney-Rivlin material

Similar to the linearization of the constitutive relationship for Neo-Hookean materials (100), we differentiate (69) using variational notation,

(102)\[ \begin{aligned} \diff \bm{S} &= \left( \lambda J^2 V^{''} + \lambda J \, V'\right) (\bm{C}^{-1} \tcolon \diff \bm{E}) \bm{C}^{-1} \\ &\quad + \left( \lambda J \, V' - (\mu_1 + 2 \mu_2) \right) \diff \bm{C}^{-1} \\ &\quad + 2 \mu_2 \left( \trace \left( \diff \bm{E} \right) \bm{I} - \diff \bm{E} \right) . \end{aligned} \]

Note that this agrees with the linearization of the Neo-Hookean constitutive relationship {eq}eq-neo-hookean-incremental-stress if \(\mu_1 = \mu, \mu_2 = 0\). Moving from Neo-Hookean to Mooney-Rivlin modifies the second term and adds the third.

Cancellation vs symmetry

Some cancellation is possible (at the expense of symmetry) if we substitute (100) into (98),

(103)\[ \begin{aligned} \diff \bm P &= \diff \bm F\, \bm S + \lambda J^2 \left(\bm C^{-1} : \diff \bm E \right) \bm F^{-T} + 2\left(\mu - \frac{\lambda}{2} \left(J^2 - 1 \right) \right) \bm F^{-T} \diff\bm E \, \bm C^{-1} \\ &= \diff \bm F\, \bm S + \lambda J^2 \left(\bm F^{-T} : \diff \bm F \right) \bm F^{-T} + \left(\mu - \frac{\lambda}{2} \left(J^2 - 1 \right) \right) \bm F^{-T} \left( \bm F^T \diff \bm F + \diff \bm F^T \bm F \right) \bm C^{-1} \\ &= \diff \bm F\, \bm S + \lambda J^2 \left( \bm F^{-T} : \diff \bm F \right) \bm F^{-T} + \left(\mu - \frac{\lambda}{2} \left(J^2 - 1 \right) \right) \Big( \diff \bm F\, \bm C^{-1} + \bm F^{-T} \diff \bm F^T \bm F^{-T} \Big), \end{aligned} \]

where we have exploited \(\bm F \bm C^{-1} = \bm F^{-T}\) and

\[ \begin{aligned} \bm C^{-1} \tcolon \diff \bm E = \bm C_{IJ}^{-1} \diff \bm E_{IJ} &= \frac 1 2 \bm F_{Ik}^{-1} \bm F_{Jk}^{-1} (\bm F_{\ell I} \diff \bm F_{\ell J} + \diff \bm F_{\ell I} \bm F_{\ell J}) \\ &= \frac 1 2 \Big( \delta_{\ell k} \bm F_{Jk}^{-1} \diff \bm F_{\ell J} + \delta_{\ell k} \bm F_{Ik}^{-1} \diff \bm F_{\ell I} \Big) \\ &= \bm F_{Ik}^{-1} \diff \bm F_{kI} = \bm F^{-T} \tcolon \diff \bm F. \end{aligned} \]

We prefer to compute with (100) because (103) is more expensive, requiring access to (non-symmetric) \(\bm F^{-1}\) in addition to (symmetric) \(\bm C^{-1} = \bm F^{-1} \bm F^{-T}\), having fewer symmetries to exploit in contractions, and being less numerically stable.

Hyperelasticity, current configuration

In the preceding discussion, all equations have been formulated in the initial configuration. This may feel convenient in that the computational domain is clearly independent of the solution, but there are some advantages to defining the equations in the current configuration.

  1. Body forces (such as gravity), traction, and contact are more easily defined in the current configuration.

  2. Mesh quality in the initial configuration can be very bad for large deformation.

  3. The required storage and numerical representation can be smaller in the current configuration.

Most of the benefit in case 3 can be attained solely by moving the Jacobian representation to the current configuration [DPA+20], though residual evaluation may also be slightly faster in current configuration. There are multiple commuting paths from the nonlinear weak form in initial configuration (95) to the Jacobian weak form in current configuration (108). One may push forward to the current configuration and then linearize or linearize in initial configuration and then push forward, as summarized below.

(104)\[ \begin{CD} {\overbrace{\nabla_X \bm{v} \tcolon \bm{FS}}^{\text{Initial Residual}}} @>{\text{push forward}}>{}> {\overbrace{\nabla_x \bm{v} \tcolon \bm{\tau}}^{\text{Current Residual}}} \\ @V{\text{linearize}}V{\begin{smallmatrix} \diff \bm{F} = \nabla_X\diff \bm{u} \\ \diff \bm{S} \left( \diff \bm{E} \right) \end{smallmatrix}}V @V{\begin{smallmatrix} \diff\nabla_x\bm{v} = -\nabla_x\bm{v} \nabla_x \diff \bm{u} \\ \diff \bm{\tau} \left( \diff \bm{\epsilon} \right) \end{smallmatrix}}V{\text{linearize}}V \\ {\underbrace{\nabla_X\bm{v}\tcolon \left( \diff \bm{F}\bm{S} + \bm{F}\diff \bm{S} \right)}_\text{Initial Jacobian}} @>{\text{push forward}}>{}> {\underbrace{\nabla_x\bm{v}\tcolon \left( \diff \bm{\tau} -\bm{\tau} \left( \nabla_x \diff \bm{u} \right)^T \right)}_\text{Current Jacobian}} \end{CD} \]

We will follow both paths for consistency and because both intermediate representations may be useful for implementation.

Push forward, then linearize

The first term of (95) can be rewritten in terms of the symmetric Kirchhoff stress tensor \(\bm{\tau} = J \bm{\sigma} = \bm{P} \bm{F}^T = \bm{F} \bm{S} \bm{F}^T\) as

\[ \nabla_X \bm{v} \tcolon \bm{P} = \nabla_X \bm{v} \tcolon \bm{\tau} \bm{F}^{-T} = \nabla_X \bm{v} \bm{F}^{-1} \tcolon \bm{\tau} = \nabla_x \bm{v} \tcolon \bm{\tau} \]

therefore, the weak form in terms of \(\bm{\tau}\) and \(\nabla_x\) with integral over \(\Omega_0\) is

(105)\[ r(\bm u) \coloneqq \int_{\Omega_0}{\nabla_x \bm{v} \tcolon \bm{\tau}} \, dV - \int_{\Omega_0}{\bm{v} \cdot \rho_0 \bm{g}} \, dV - \int_{\partial \Omega_0}{\bm{v} \cdot \bar{\bm t}} \, dA = 0, \quad \forall \bm{v} \in \mathcal{V}. \]

Linearize in current configuration

To derive a Newton linearization of (105), first we define

(106)\[ \nabla_x \diff \bm{u} = \nabla_X \diff \bm{u} \ \bm{F}^{-1} = \diff \bm{F} \bm{F}^{-1} \]

Then by expanding the directional derivative of \(\nabla_x \bm{v} \tcolon \bm{\tau}\), we arrive at

(107)\[ \diff \ \left(\nabla_x \bm{v} \tcolon \bm{\tau} \right) = \diff \ \left( \nabla_x \bm{v} \right) \tcolon \bm{\tau} + \nabla_x \bm{v} \tcolon \diff \bm{\tau} . \]

The first term of (107) can be written as

\[ \begin{aligned} \diff \ (\nabla_x \bm{v})\tcolon \bm{\tau} &= \diff \ \left( \nabla_X \bm{v} \bm{F}^{-1} \right) \tcolon \bm{\tau} = \left( \underbrace{\nabla_X \left(\diff \bm{v}\right)}_{0}\bm{F}^{-1} + \nabla_X \bm{v}\diff \bm{F}^{-1} \right) \tcolon \bm{\tau}\\ &= \left( -\nabla_X \bm{v} \bm{F}^{-1}\diff \bm{F}\bm{F}^{-1} \right) \tcolon \bm{\tau} = \left( -\nabla_x \bm{v} \diff \bm{F}\bm{F}^{-1} \right) \tcolon \bm{\tau}\\ &= \left( -\nabla_x \bm{v} \nabla_x \diff \bm{u} \right) \tcolon \bm{\tau} = -\nabla_x \bm{v}\tcolon\bm{\tau} \left( \nabla_x \diff \bm{u} \right)^T \,, \end{aligned} \]

where we have used \(\diff \bm{F}^{-1}=-\bm{F}^{-1} \diff \bm{F} \bm{F}^{-1}\) and (106). Using this and (107) in (105) yields the Jacobian form in the current configuration: find \(\diff \bm{u} \in \mathcal{V}\) such that

(108)\[ \int_{\Omega_0} \nabla_x \bm{v} \tcolon \left( \diff \bm{\tau} - \bm{\tau} \left( \nabla_x \diff \bm{u} \right)^T \right) = -r(\bm u). \]
Deriving \(\diff\bm\tau\) for Neo-Hookean material

To derive a useful expression of \(\diff\bm\tau\) for Neo-Hookean materials, we will use the representations

\[ \begin{aligned} \diff \bm{b} &= \diff \bm{F} \bm{F}^T + \bm{F} \diff \bm{F}^T \\ &= \nabla_x \diff \bm{u} \ \bm{b} + \bm{b} \ (\nabla_x \diff \bm{u})^T \\ &= (\nabla_x \diff\bm u)(\bm b - \bm{I}) + (\bm b - \bm{I}) (\nabla_x \diff\bm u)^T + 2 \diff\bm\epsilon \end{aligned} \]


(109)\[ \diff\bm\epsilon \equiv \frac{1}{2}\Big(\nabla_x \diff\bm{u} + (\nabla_x \diff\bm{u})^T \Big) . \]


\[ \begin{aligned} \diff \, (J^2 -1) &= 2 J \frac{\partial J}{\partial b} \tcolon \diff \bm{b} = J^2 \bm{b}^{-1}\tcolon \diff \bm{b}\\ &= J^2 \bm b^{-1} \tcolon \Big(\nabla_x \diff\bm u \ \bm b + \bm b (\nabla_x \diff\bm u)^T \Big) \\ &= 2 J^2 \trace (\nabla_x \diff\bm u) \\ &= 2 J^2 \trace \diff\bm\epsilon . \end{aligned} \]

Substituting into (57) gives

(110)\[ \begin{aligned} \diff \bm{\tau} &= \mu \diff \bm{b} + \lambda J^2 \trace (\diff\bm\epsilon) \bm{I} \\ &= \underbrace{2 \mu \diff\bm\epsilon + \lambda J^2 \trace \left( \diff\bm\epsilon \right) \bm{I} - \lambda \left( J^2 -1 \right) \diff\bm\epsilon}_{\bm F \diff\bm S \bm F^T} \\ &\quad + \left( \nabla_x \diff\bm u \right) \underbrace{\Big( \mu \left( \bm b - \bm{I} \right) + \frac{\lambda}{2} \left( J^2 -1 \right) \bm{I} \Big)}_{\bm\tau} \\ &\quad + \underbrace{\Big( \mu \left( \bm b - \bm{I} \right) + \frac{\lambda}{2} \left( J^2 -1 \right) \bm{I} \Big)}_{\bm\tau} \left( \nabla_x \diff\bm u \right)^T , \end{aligned} \]

where the final expression has been identified according to

(111)\[ \diff\bm\tau = \diff\ \left( \bm F \bm S \bm F^T \right) = \left( \nabla_x \diff\bm u \right) \bm\tau + \bm F \diff\bm S \bm F^T + \bm\tau\left( \nabla_x \diff\bm u \right)^T. \]

Collecting terms, we may thus opt to use either of the two forms

(112)\[ \begin{aligned} \diff \bm{\tau} -\bm{\tau}(\nabla_x \diff\bm{u})^T &= (\nabla_x \diff\bm u)\bm\tau + \bm F \diff\bm S \bm F^T \\ &= (\nabla_x \diff\bm u)\bm\tau + \lambda J^2 \trace(\diff\bm\epsilon) \bm{I} + 2 \left( \mu - \frac{\lambda}{2} \left( J^2 -1\right) \right) \diff\bm\epsilon, \end{aligned} \]

with the last line showing the especially compact representation available for Neo-Hookean materials.

Deriving \(\diff\bm\tau\) for Mooney-Rivlin material

To derive a useful expression of \(\diff\bm\tau\) for Mooney-Rivlin materials, we will use the representations of \(\diff \bm{b}, \diff (J^2 -1)\) as derived in previous section and

\[ \begin{aligned} \diff \mathbb{I_1} &= \frac{\partial \mathbb{I_1}}{\partial \bm{b}}\tcolon \diff \bm{b} = \bm{I} \tcolon \diff \bm{b} = \trace (\diff \bm{b})\\ \end{aligned} \]

Substituting into (71) gives

(113)\[ \begin{aligned} \diff \bm{\tau} &= \mu_1 \diff \bm{b} + \lambda J^2 \trace (\diff\bm\epsilon) \bm{I} + \mu_2 \left(\trace (\diff \bm{b}) \bm{b} + \mathbb{I_1}\diff \bm{b} - \bm{b}\diff \bm{b} - \diff \bm{b} \bm{b}\right)\\ &=2 \mu_1 \diff\bm\epsilon + \lambda J^2 \trace (\diff\bm\epsilon) \bm{I} - \lambda \left( J^2 -1 \right) \diff\bm\epsilon + \left(\nabla_x \diff\bm u \right) \Big( \mu_1 (\bm b - \bm{I}) +\frac{\lambda}{2} \left( J^2 -1 \right) \bm{I} \Big)\\ &\quad + \Big( \mu_1 (\bm b - \bm{I}) + \frac{\lambda}{2} \left( J^2 -1 \right) \bm{I} \Big) (\nabla_x \diff\bm u)^T + \mu_2 \trace (\diff \bm{b}) \bm{b} \\ &\quad + \mu_2\mathbb{I_1} \left(\nabla_x \diff\bm u \right) \bm{b} + \mu_2\mathbb{I_1} \bm{b} \left(\nabla_x \diff\bm u \right)^T - \mu_2 \left(\nabla_x \diff\bm u \right) \bm{b}^2 - \mu_2 \bm{b} \left(\nabla_x \diff\bm u \right)^T \bm{b} \\ &\quad - \mu_2 \bm{b} \left(\nabla_x \diff\bm u \right) \bm{b} - \mu_2 \bm{b}^2 \left(\nabla_x \diff\bm u \right)^T + 4\mu_2 \diff\bm\epsilon - 4\mu_2 \diff\bm\epsilon \\ &= \underbrace{\lambda J^2 \trace(\diff\bm \epsilon) \bm{I} + 2 \left( \mu_1 + 2\mu_2 - \frac{\lambda}{2} \left( J^2 -1 \right) \right) \diff \bm{\epsilon} + \mu_2 \trace \left( \diff \bm{b} \right) \bm{b} - 2\mu_2 \bm{b} \diff \bm{\epsilon} \bm{b}}_{\bm F \diff\bm S \bm F^T} \\ &\quad + (\nabla_x \diff\bm u)\underbrace{\Big( \frac{\lambda}{2} \left( J^2 -1 \right) \bm{I} + \mu_1 \left( \bm{b} - \bm{I} \right) + \mu_2 \left( \mathbb{I_1} \bm{b} - 2 \bm{I} - \bm{b}^2 \right) \Big)}_{\bm\tau} \\ &\quad + \underbrace{\Big( \frac{\lambda}{2} \left( J^2 -1 \right) \bm{I} + \mu_1 \left( \bm{b} - \bm{I} \right) + \mu_2 \left( \mathbb{I_1} \bm{b} - 2 \bm{I} - \bm{b}^2 \right) \Big)}_{\bm\tau} (\nabla_x \diff\bm u)^T , \end{aligned} \]

where the final expression has been identified according to

\[ \diff\bm\tau = \diff\ (\bm F \bm S \bm F^T) = (\nabla_x \diff\bm u) \bm\tau + \bm F \diff\bm S \bm F^T + \bm\tau(\nabla_x \diff\bm u)^T. \]

Collecting terms, we may thus opt to use either of the two forms

(114)\[ \begin{aligned} \diff \bm{\tau} -\bm{\tau}(\nabla_x \diff\bm{u})^T &= (\nabla_x \diff\bm u)\bm\tau + \bm F \diff\bm S \bm F^T \\ &= (\nabla_x \diff\bm u)\bm\tau + \lambda J^2 \trace(\diff\bm \epsilon) \bm{I} + 2 \left( \mu_1 + 2\mu_2 - \frac{\lambda}{2} \left( J^2 -1 \right) \right) \diff \bm{\epsilon} \\ &\quad + \mu_2 \trace \left( \diff \bm{b} \right) \bm{b} - 2\mu_2 \bm{b} \diff \bm{\epsilon} \bm{b}.\\ \end{aligned} \]

with the last line showing the especially compact representation available for Mooney-Rivlin materials.

Linearize, then push forward

We can move the derivatives to the current configuration via

\[ \nabla_X \bm{v} \tcolon \diff \bm{P} = \left( \nabla_X \bm{v} \right) \bm{F}^{-1} \tcolon \diff \bm{P} \bm{F}^T = \nabla_x \bm{v} \tcolon \diff \bm{P} \bm{F}^T \]

and expand

(115)\[ \begin{aligned} \diff \bm{P} \bm{F}^T &= \diff \bm{F} \bm{S} \bm{F}^T + \bm{F} \diff \bm{S} \bm{F}^T \\ &= \underbrace{\diff \bm{F} \bm{F}^{-1}}_{\nabla_x \diff \bm{u}} \underbrace{\bm{F} \bm{S} \bm{F}^T}_{\bm{\tau}} + \bm{F} \diff \bm{S} \bm{F}^T . \end{aligned} \]
Representation of \(\bm F \diff\bm S \bm F^T\) for Neo-Hookean materials

Now we push (100) forward via

\[ \begin{aligned} \bm F \diff\bm S \bm F^T &= \left( \lambda J^2 V^{''} + \lambda J \, V'\right) (\bm{C}^{-1} \tcolon \diff \bm{E}) \bm F \bm C^{-1} \bm F^T + \left( \lambda J \, V' - \mu \right) \bm F \diff \bm C^{-1} \, \bm F^T \\ &= \left( \lambda J^2 V^{''} + \lambda J \, V'\right) \left( \bm C^{-1} \tcolon \diff\bm E \right) \bm{I} - 2 \left( \lambda J \, V' - \mu \right) \bm F^{-T} \diff\bm E \, \bm F^{-1} \\ &= \left( \lambda J^2 V^{''} + \lambda J \, V'\right) \trace(\nabla_x \diff\bm u) \bm{I} - 2 \left( \lambda J \, V' - \mu \right) \diff\bm \epsilon \end{aligned} \]

where we have used

(116)\[ \begin{aligned} \bm C^{-1} \tcolon \diff\bm E &= \bm F^{-1} \bm F^{-T} \tcolon \bm F^T \diff\bm F \\ &= \trace(\bm F^{-1} \bm F^{-T} \bm F^T \diff \bm F) \\ &= \trace(\bm F^{-1} \diff\bm F) \\ &= \trace(\diff \bm F \bm F^{-1}) \\ &= \trace(\nabla_x \diff\bm u) \end{aligned} \]


(117)\[ \begin{aligned} \bm F^{-T} \diff\bm E \, \bm F^{-1} &= \frac 1 2 \bm F^{-T} \left(\bm F^T \diff\bm F + \diff\bm F^T \bm F\right) \bm F^{-1} \\ &= \frac 1 2 \left(\diff \bm F \bm F^{-1} + \bm F^{-T} \diff\bm F^T\right) \\ &= \frac 1 2 \Big(\nabla_x \diff\bm u + (\nabla_x\diff\bm u)^T \Big) \equiv \diff\bm\epsilon. \end{aligned} \]

Collecting terms, the weak form of the Newton linearization for Neo-Hookean materials in the current configuration is

(118)\[ \int_{\Omega_0} \nabla_x \bm v \tcolon \left( (\nabla_x \diff\bm u) \bm\tau + \left( \lambda J^2 V^{''} + \lambda J \, V'\right) \trace(\diff\bm\epsilon)\bm{I} - 2 \left( \lambda J \, V' - \mu \right) \diff \bm\epsilon \right) dV = \text{rhs}, \]

which equivalent to Algorithm 2 of [DPA+20] and requires only derivatives with respect to the current configuration. Note that (112) and (118) have recovered the same representation using different algebraic manipulations.

Representation of \(\bm F \diff\bm S \bm F^T\) for Mooney-Rivlin materials

We push forward \(\diff\bm S\) (102) which yields to

\[ \begin{aligned} \bm F \diff\bm S \bm F^T &= \left( \lambda J^2 V^{''} + \lambda J \, V'\right) (\bm{C}^{-1} \tcolon \diff \bm{E}) \bm F \bm C^{-1} \bm F^T + \left( \lambda J \, V' - (\mu_1 + 2\mu_2) \right) \bm F \diff \bm C^{-1} \, \bm F^T \\ &\quad + 2 \mu_2 \left( \trace \left( \diff \bm{E} \right) \bm{b} - \bm{F} \diff \bm{E} \bm{F}^T \right) \\ &= \left( \lambda J^2 V^{''} + \lambda J \, V'\right) \trace(\nabla_x \diff\bm u) \bm{I} - 2 \left( \lambda J \, V' - (\mu_1 + 2\mu_2) \right) \diff\bm \epsilon \\ &\quad + 2 \mu_2 \left( \frac{1}{2} \trace \left( \diff \bm{b} \right) \bm{I} - \bm{b} \diff \bm{\epsilon} \right)\bm{b} \end{aligned} \]

where we have used

(119)\[ \begin{aligned} \trace\left(\diff\bm E \right)&= \trace\left(\bm{F}^T \diff\bm{F} \right) = \trace\left(\bm{F} \diff\bm{F}^T \right) \\ &= \trace\left(\diff\bm F \bm{F}^T \right) = \trace\left(\diff \bm e \right) \\ \end{aligned} \]


(120)\[ \begin{aligned} \bm{F} \diff\bm{E} \, \bm F^{T} &= \frac{1}{2} \bm{F} \left(\bm{F}^T \diff\bm{F} + \diff\bm{F}^T \bm{F} \right) \bm{F}^{T} \\ &= \frac{1}{2} \left(\bm{F}\bm{F}^T \diff\bm{F}\bm{F}^{-1}\bm{F}\bm{F}^{T} + \bm{F}\bm{F}^{T}\bm{F}^{-T}\diff\bm{F}^T \bm{F}\bm{F}^T \right) \\ &= \frac 1 2 \bm{b}\Big(\nabla_x \diff\bm u + (\nabla_x\diff\bm u)^T \Big)\bm{b} =\bm{b} \, \diff\bm\epsilon \, \bm{b}. \end{aligned} \]

Collecting terms, we arrive at

(121)\[ \begin{aligned} \diff \bm{P} \bm{F}^T &= \left(\nabla_x \diff \bm{u} \right) \bm{\tau} + \bm{F} \diff \bm{S} \bm{F}^T \\ &= \left( \nabla_x \diff \bm{u} \right) \bm{\tau} + \left( \lambda J^2 V^{''} + \lambda J \, V'\right) \trace(\nabla_x \diff\bm u) \bm{I} \\ &\quad - 2 \left( \lambda J \, V' - (\mu_1 + 2\mu_2) \right) \diff\bm \epsilon + \mu_2 \trace \left( \diff \bm{b} \right) \bm{b} - 2 \mu_2 \bm{b} \diff \bm{\epsilon} \bm{b} \end{aligned} \]

Note that (114) and (121) have recovered the same representation using different algebraic manipulations.

Pressure boundary condition

One of the important load case is the pressure boundary loading which is caused by liquids or gases on the surface of the solid structure. The pressure boundary load depends upon the current state of the deformation and can be considered as a traction vector \(\bar{\bm{t}} = \bm{\sigma} \cdot {\bm{n}} = -p{\bm{n}}\) per unit current surface, acting in the direction of the outward unit normal \({\bm{n}}\). Therefore, the following surface integral will be add to the weak form (95), or (105)

(122)\[ \int_{\partial \Omega}{\bm{v} \cdot \left( -p {\bm{n}} \right)} \, da, \quad \forall \bm{v} \in \mathcal{V}. \]

where the normal on current surface is computed by

(123)\[ {\bm{n}} = \frac{ \frac{\partial \bm{x}}{\partial \xi_1} \times \frac{\partial \bm{x}}{\partial \xi_2} }{\lvert \frac{\partial \bm{x}}{\partial \xi_1} \times \frac{\partial \bm{x}}{\partial \xi_2} \rvert} \]

in which \( \xi_1, \xi_2 \in [-1,1]^2 \) are reference coordinate system on the face. If we write the surface area in terms of reference coordinate as \(da = \lvert \frac{\partial \bm{x}}{\partial \xi_1} \times \frac{\partial \bm{x}}{\partial \xi_2} \rvert d\xi_1 d\xi_2\), the pressure load (122) can be simplified to

(124)\[ -\int_{[-1,1]^2} \bm{v} \cdot p \left(\frac{\partial \bm{x}}{\partial \xi_1} \times \frac{\partial \bm{x}}{\partial \xi_2} \right) d\xi_1 d\xi_2 \]

To achieve second order convergence in our Newton solver, we will need the linearization of (124) which for the constant pressure is given by

(125)\[ -\int_{[-1,1]^2} \bm{v} \cdot p \left( \frac{\partial \diff \bm{u}}{\partial \xi_1} \times \frac{\partial \bm{x}}{\partial \xi_2} + \frac{\partial \bm{x}}{\partial \xi_1} \times \frac{\partial \diff \bm{u}}{\partial \xi_2}\right) d\xi_1 d\xi_2 . \]

Matrix-free implementation

Ratel solves the momentum balance equations using unstructured high-order finite/spectral element spatial discretizations by matrix-free approach. We present here the notation and mathematical formulation of matrix-free method for a general Dirichlet problem \(\bm{R}(\bm{u}) = 0\): find \(\bm{u} \in \mathcal{V} \subset H^1 \left( \Omega_0 \right)\) such that

(126)\[ \langle\bm{v}, \bm{R}(\bm{u}) \rangle= \int_{\Omega_0}{\bm{v} \cdot \bm{f}_0 (\bm{u}, \nabla \bm{u}) + \nabla \bm{v} \tcolon \bm{f}_1 (\bm{u}, \nabla \bm{u})} \, dV = 0, \quad \forall \bm{v} \in \mathcal{V}, \]

where the operators \(\bm{f}_0\) and \(\bm{f}_1\) contain all possible sources in the problem. In order to solve (126) with Newton-Krylov iterative solvers we need its Jacobian form: find \(\diff \bm{u} \in \mathcal{V} \subset H^1 \left( \Omega_0 \right)\) such that

(127)\[ \langle\bm{v}, \bm{J}(\bm{u}) \diff \bm{u} \rangle= \int_{\Omega_0}{\bm{v} \cdot \diff \bm{f}_0 + \nabla \bm{v} \tcolon \diff \bm{f}_1} \, dV, \]

where the linearization of operators \(\bm{f}_0\) and \(\bm{f}_1\) are

\[ \diff \bm{f}_i = \frac{\partial \bm{f}_i}{\partial \bm{u}} \diff \bm{u} + \frac{\partial \bm{f}_i}{\partial \nabla \bm{u}} \nabla \diff \bm{u}, \quad i=0,1 \]

It should be noted that the gradient in the (126), (127) depends on the configuration system and it could be with respect to initial configuration \(\bm{X}\) i.e., \(\nabla_X \bm{v}\) (Lagrangian approach) or current configuration \(\bm{x}\) i.e., \(\nabla_x \bm{v}\) (Eulerian approach).

Compare with governing equations derived in pervious sections for linear and large deformation, it is easy to verify that

  • For linear elasticity described in (93) we have

\[ \bm f_0 = -\rho \bm g, \quad \bm f_1 = \bm \sigma(\bm u). \]

where the linearization satisfies \(\diff\bm f_{1}(\diff\bm u) = \bm f_1(\diff \bm u)\) due to linearity.

  • For hyperelastic in initial configuration described in (95) and (97) we have

\[ \bm f_0 = -\rho_0 \bm g, \quad \bm f_1 = \bm F \bm S, \quad \diff \bm f_1 = \diff \bm F \bm S + \bm F \diff \bm S. \]
  • For hyperelastic in current configuration derived in (105) and (108)

\[ \bm f_0 = -\rho_0 \bm g, \quad \bm f_1 = \bm \tau, \quad \diff \bm f_1 = \diff \bm{\tau} - \bm{\tau} \left( \nabla_x \diff \bm{u} \right)^T. \]

Static and Quasistatic

The above formulation solves the steady-state equations but with updates to the time parameter in the boundary conditions. Ratel also offers a static solve option via a PETSc SNES solve; this is covered in the static elasticity example. The quasistatic example uses the PETSc Timestepper (TS) object to manage pseudo-timestepping.

Large deformation solid mechanics exhibits both geometric and material nonlinearities, leading to path dependence by which there are be multiple static solutions for a specified set of boundary conditions. To disambiguate the multiple solutions, we solve the hyperelastic problem as a non-autonomous differential algebraic equation of index 1, with boundary conditions/loading a function of time \(t \in [0, 1]\). The current quasistatic example uses applied load (rather than displacement) and use backward Euler from PETSc’s TS with extrapolation-based hot starts disabled for simplicity.

Each pseudo time step requires a nonlinear solve, which is implemented using PETSc’s Scalable Nonlinear Equations Solver (SNES). By default, Ratel uses Newton-CG in which a multigrid V-cycle is used as a preconditioner for conjugate gradients. Additionally, we use a “critical point” line search, which supposes that the residual is the functional gradient of a latent objective function, \(\bm F(\bm u) = \nabla_{\bm u}\Psi(\bm u)\) and uses one step of a secant method to find \(\alpha\) for which \(F(\bm u + \alpha \delta \bm u)^{T} \delta \bm u = 0\) where \(\delta \bm u\) is the search direction found by Newton. This line search is inspired by the strong Wolfe conditions in optimization [NW99], but without explicit evaluation of the objective \(\Psi\), which may not be available or may not exist (e.g., for non-conservative models).

More information on the PETSc TS and SNES objects can be found in the PETSc documentation.


In (126), we represented the static problem i.e. \(\ddot{\bm u} = 0\). However, in the dynamic example acceleration is not zero and we use the PETSc Timestepper (TS) object to manage timestepping. Specifically, Ratel uses the Generalized-Alpha method for second order systems (TSALPHA2) to solve the second order system of equations given by the momentum balance. More information on the PETSc TSALPHA2 time stepper can be found in the PETSc documentation TS.

For elastodynamics we need to add

(128)\[ \int_{\Omega_0} \bm v \cdot \left(\rho_0 \, \ddot{\bm u} \right) \, dV, \]

to the residual equation (126) (or adding \(\rho_0 \ddot{\bm u}\) to \(\bm f_0\) term) and its jacobian

(129)\[ \mathrm{shift_a} \, \int_{\Omega_0} \bm v \cdot \left(\rho_0 \, \diff \bm u \right) \, dV, \]

Note that in general for a time dependent residual function

(130)\[ F (t, \bm u, \dot{\bm u}, \ddot{\bm u}) = 0, \]

we can write the Jacobian as

(131)\[ \diff F = \frac{\partial F}{\partial \bm u} \, \diff \bm u + \mathrm{shift_v} \, \frac{\partial F}{\partial \dot{\bm u}} \, \diff \bm u + \mathrm{shift_a} \, \frac{\partial F}{\partial \ddot{\bm u}} \, \diff \bm u, \]

where \(\mathrm{shift_v}, \mathrm{shift_a}\) is related to the inverse of time step \(\Delta t\) and its square.