Run a basic torch calculation in parallel at startup to reduce the performance impact on the first generation
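
A minimal sketch of the idea, not the actual change: a tiny forward pass runs on a background thread during startup, so PyTorch's one-time initialization cost is paid before the first generation request arrives. The names `torch_warmup` and `start_warmup` are hypothetical, and the choice of layers (one `Linear`, one `Conv2d`) and the device selection are assumptions made for illustration.

```python
import threading


def torch_warmup():
    """Run a tiny forward pass so PyTorch performs its one-time initialization."""
    # Imported here so the import itself also happens on the background thread.
    import torch

    # Assumption: use CUDA when available, otherwise fall back to CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    with torch.no_grad():
        # A 1x1 linear layer applied to a 1x1 input.
        x = torch.zeros((1, 1), device=device)
        torch.nn.Linear(1, 1).to(device)(x)

        # A 3x3 convolution applied to a single-channel 3x3 input.
        x = torch.zeros((1, 1, 3, 3), device=device)
        torch.nn.Conv2d(1, 1, (3, 3)).to(device)(x)


def start_warmup(target=torch_warmup):
    """Start `target` on a daemon thread and return the thread without waiting."""
    thread = threading.Thread(target=target, name="torch-warmup", daemon=True)
    thread.start()
    return thread
```

Calling `start_warmup()` early in startup lets the rest of initialization proceed while the warm-up runs. The returned thread can be joined before the first generation if the caller wants to be sure the warm-up has finished.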