1. Since gradient computation may be the most computationally intensive operation, for a fair comparison we compare SGD to SVRG based on the number of gradient computations.
2. For simplicity we will only consider the case that each...
3. When the number of components n is very large, each iterat ...
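
As a rough illustration of the comparison described in note 1, the sketch below counts component-gradient evaluations for each method. The function names and the parameters `epochs`, `m`, and `stages` are hypothetical, introduced only for this example. It assumes that SGD evaluates one component gradient per step, and that each SVRG stage computes one full gradient over all n components and then, in each of its m inner steps, evaluates two component gradients (one at the current iterate and one at the snapshot, without storing snapshot gradients).

```python
def sgd_gradient_count(n: int, epochs: int) -> int:
    """Component-gradient evaluations for SGD.

    Assumption: one component gradient per step, n steps per epoch.
    """
    return n * epochs


def svrg_gradient_count(n: int, m: int, stages: int) -> int:
    """Component-gradient evaluations for SVRG.

    Assumption: each stage costs n evaluations for the full gradient at the
    snapshot, plus 2 evaluations per inner step (current iterate and
    snapshot) for m inner steps. Snapshot gradients are not cached.
    """
    return stages * (n + 2 * m)


if __name__ == "__main__":
    n = 1000        # number of components (illustrative value)
    m = 2 * n       # inner-loop length (illustrative choice)
    print(svrg_gradient_count(n, m, stages=1))
    print(sgd_gradient_count(n, epochs=5))
```

Under these assumptions, one SVRG stage with m = 2n costs as many gradient evaluations as five SGD passes over the data, which is why plotting progress against gradient computations rather than iterations puts the two methods on the same footing.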