Meta Llama 3 Optimized CPU Inference with Hugging Face and PyTorch (Towards Data Science, Apr 19). Learn how to reduce model latency when deploying Meta* Llama 3 on CPUs.
Transforming Financial Services with RAG: Personalized Financial Advice (Apr 5). Synthesize the complex web of financial strategies, regulations, and trends with a personalized, RAG-based financial advice chatbot.
Transforming Manufacturing with RAG: Delivering NextGen Equipment Maintenance (Apr 3). Keep operations online with actionable, relevant, and effective maintenance strategies powered by retrieval augmented generation.
Transforming Retail with RAG: The Future of Personalized Shopping (Apr 2). Deliver dynamic, fresh, and timely recommendations to shoppers with retrieval augmented generation.
Improving LLM Inference Latency on CPUs with Model Quantization (Towards Data Science, Feb 29). Discover how to significantly improve inference latency on CPUs using quantization techniques for mixed, int8, and int4 precisions.
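As a flavor of the quantization approach that article covers, here is a minimal sketch using stock PyTorch's dynamic int8 quantization on a toy model; the article itself targets LLMs and additional precisions, and the model and shapes below are illustrative assumptions, not taken from the article.

```python
# Minimal sketch: dynamic int8 quantization with stock PyTorch.
# The toy model below is a stand-in for a real LLM.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 8))
model.eval()

# Replace Linear layers with dynamically quantized int8 equivalents;
# weights are stored in int8 and activations are quantized on the fly.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
with torch.no_grad():
    out = quantized(x)
print(out.shape)  # torch.Size([1, 8])
```

Dynamic quantization like this shrinks weight memory roughly 4x versus float32 and can speed up CPU inference for linear-heavy models, at a small accuracy cost.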
Distributed Fine-Tuning of Stable Diffusion with CPUs on AWS (AWS Tip, Dec 20, 2023). Learn how to use Hugging Face* Accelerate on Amazon Web Services (AWS)* to fine-tune Stable Diffusion.
Retrieval Augmented Generation (RAG) Inference Engines with LangChain on CPUs (Towards Data Science, Dec 5, 2023). Exploring scale, fidelity, and latency in AI applications with RAG.
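To illustrate the retrieve-then-generate pattern behind RAG, here is a toy, dependency-free sketch; it is not the article's LangChain implementation. The keyword-overlap scorer stands in for embedding similarity search against a vector store, and the prompt-assembly step stands in for an LLM call.

```python
# Toy RAG sketch: retrieve relevant context, then ground the prompt in it.
docs = [
    "Quantization reduces model precision to speed up CPU inference.",
    "Retrieval augmented generation grounds answers in external documents.",
    "Stable Diffusion can be fine-tuned with Hugging Face Accelerate.",
]

def retrieve(query, k=1):
    # Rank documents by word overlap with the query (a stand-in for
    # embedding similarity search in a real vector store).
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return scored[:k]

def answer(query):
    context = " ".join(retrieve(query))
    # A real system would send this grounded prompt to an LLM;
    # here we just return the assembled prompt.
    return f"Context: {context}\nQuestion: {query}"

print(answer("What is retrieval augmented generation?"))
```

A production pipeline swaps in an embedding model, a vector index, and an LLM, but the control flow stays the same: retrieve, assemble context, generate.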
A Case for Operational-Centric AI (Nov 2, 2023). Proposing a model for understanding the evolution of AI from the perspective of engineering resource investment.
Fast Prototyping of Artificial Intelligence Applications (Intel Analytics Software, Oct 23, 2023). Use Intel AI Reference Kits and open-source software for fast prototyping.