$> -\delta$ and $< \delta$, we have that the loss function is continuous and differentiable since it is MSE. When the error is $< -\delta$ or $> \delta$, we also have that the loss function is continuous and differentiable since it is a scaled & translated version of MAE, and we know that MAE is differentiable wherever the error is not 0. And since the two piecewise definitions of the function agree at their boundary, the whole function is continuous.

Now to show differentiability, we need to ensure the slopes match at the points where the error equals $\pm\delta$. The derivative of the function when $|y-f(x)| \leq \delta$ is $y - f(x)$. The derivative when $|y - f(x)| > \delta$ is $\delta \; \cdot \; \text{sign}(y - f(x))$. Then, when $y-f(x) = \delta$ both derivatives equal $\delta$, and when $y - f(x) = -\delta$ both equal $-\delta$, so the slopes match. Thus, the function is differentiable for all errors in $\mathbb{R}$.

**Implementation Note:** a quick-and-dirty way to implement Huber loss is just to use MSE and clip the gradients (that is, do not let the gradient exceed some maximum magnitude).

---

# References

https://en.wikipedia.org/wiki/Huber_loss
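The piecewise derivative above, and the implementation note about clipped MSE gradients, can be sketched numerically. This is a minimal illustration, not a production implementation; the function names `huber_loss` and `huber_grad` are my own, and the loss is written as a function of the error $e = y - f(x)$:

```python
import numpy as np

def huber_loss(error, delta=1.0):
    """Huber loss as a function of the error e = y - f(x)."""
    quadratic = 0.5 * error**2                       # MSE region: |e| <= delta
    linear = delta * (np.abs(error) - 0.5 * delta)   # scaled & translated MAE region
    return np.where(np.abs(error) <= delta, quadratic, linear)

def huber_grad(error, delta=1.0):
    """Derivative of the Huber loss with respect to the error."""
    return np.where(np.abs(error) <= delta, error, delta * np.sign(error))

delta = 1.0
e = np.array([-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0])

# The two pieces agree at the boundary: both formulas give 0.5 * delta**2 at |e| = delta.
print(huber_loss(np.array([delta, -delta]), delta))

# The Huber gradient equals the MSE gradient (which is just e) clipped to [-delta, delta],
# which is why "MSE + gradient clipping" approximates Huber loss.
print(np.allclose(huber_grad(e, delta), np.clip(e, -delta, delta)))
```

The `np.allclose` check makes the implementation note concrete: clipping the MSE gradient at magnitude $\delta$ reproduces the Huber gradient exactly.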