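Xinzhiyuan report

Editors: Xiniu, Haokun

[Xinzhiyuan digest] LLMs are growing explosively in size. Traditional quantization can compress a model, but only at the cost of accuracy. DFloat11, new research from a team at Rice University, breaks this deadlock: it shrinks models by about 30% while keeping the output bit-for-bit identical to the original model. Better still, with a custom GPU decompression kernel, DFloat11 raises inference throughput by up to 38.8x compared with CPU offloading.

Everyone wants a DeepSeek of their own, but not everyone has a dozen H20 GPUs with 96GB of memory each.

Quantization can sharply reduce a model's memory requirements, but it is fundamentally a lossy compression technique. In other words, the output distribution of a quantized model is inevitably affected, which lowers the accuracy and reliability of the LLM.

To address this, researchers from Rice University and other institutions propose a new lossless compression framework, Dynamic-Length Float (DFloat11), which reduces LLM size by 30% while guaranteeing outputs that are bit-for-bit identical to those of the original model.

Paper: https://arxiv.org/abs/2504.11651

To support efficient inference with dynamic-length encoding, the team developed a custom GPU kernel for fast online decompression:

1. Memory-intensive lookup tables (LUTs) are decomposed into more compact LUTs that fit entirely in GPU SRAM;
2. A two-phase kernel uses lightweight auxiliary variables to coordinate the read and write positions of threads;
3. Decompression is performed at the Transformer block level to minimize latency.

Experiments on state-of-the-art models such as Llama-3.1, Qwen-2.5, and Gemma-3 show that DFloat11 reduces model size effectively while keeping the outputs exactly the same.

Compared with offloading part of the model to the CPU, DFloat11 achieves 1.9x to 38.8x higher throughput in token generation.

With a fixed GPU memory budget, DFloat11 supports context lengths 5.3x to 13.17x longer than the uncompressed model.

Notably, DFloat11 enables lossless inference of Llama-3.1-405B (810GB) on a single node with eight 80GB GPUs.

Llama-3.1-405B has 405 billion parameters stored in the 16-bit Brain Float (BFloat16) format, so full inference requires about 810GB of memory. That exceeds the capacity of a typical GPU server such as a DGX A100/H100 with 8 x 80GB GPUs.

The first author, Tianyi Zhang, is a PhD student in computer science at Rice University. He previously earned a bachelor's degree in computer science from the University of Waterloo.

Why compress LLMs losslessly?

Current lossy quantization techniques usually compress models to a lower bit width, such as 8 or 4 bits.

Some benchmarks suggest that 8-bit quantization is a relatively "safe" compression scheme, but in practice it still falls short of a lossless model.

For example, human evaluations on LLM Arena show a significant performance drop between Llama-3.1-405B-Instruct and its 8-bit version (Llama-3.1-405B-Instruct-FP8), especially on coding and long-query tasks.

Similarly, quantizing DeepSeek-R1-Distill-Llama-70B from 16 bits to 8 bits leads to a 23.7% relative drop on GPQA (from 9.51% to 7.25%).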
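In addition, reasoning, a core capability of modern LLMs, appears to be especially sensitive to compression loss.

Some benchmarks show that DeepSeek-R1-Distill-Qwen-1.5B quantized with 8-bit SmoothQuant (applied to weights, attention, and the KV cache) loses an average of 9.09% in reasoning performance (from 48.82% to 44.29%) across datasets including AIME, MATH-500, GPQA-Diamond, and LiveCodeBench.

Lossy compression degrades quality, lossless compression lacks efficiency

Lossless compression, by contrast, reduces the size of large LLMs while preserving their exact original weights, so the model's output distribution is identical to that of the uncompressed representation (for example, BFloat16).

However, existing lossless methods focus mainly on storage efficiency, such as shrinking model checkpoints, or on optimizing performance for specialized hardware such as FPGAs.

These methods help with efficient checkpoint rollback during training and with faster downloads from model repositories such as Hugging Face, but their benefits generally do not carry over to GPU-based LLM inference.

Method

To motivate lossless compression of LLM weights, the team analyzed the compressibility of each BFloat16 component (sign, exponent, and mantissa) in the weights of recent LLMs.

Specifically, the team used Shannon entropy to quantify the information content of the parameters in the linear projection matrices of LLMs. The Shannon entropy H(·) is defined as:

$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x)$$

where $X$ is a discrete random variable whose set of possible values is $\mathcal{X}$, and $p: \mathcal{X} \to [0, 1]$ is its probability mass function.

As Figure 1 shows, the entropy of the sign and mantissa is close to their bit widths, so there is little room to compress them. The exponent, in contrast, has a much lower entropy of only about 2.6 bits even though 8 bits are allocated to it, which leaves a large opportunity for lossless compression.

A lossless LLM compression framework for efficient GPU inference

To remove the substantial information redundancy in the BFloat16 representation of LLM weights, the team proposes DFloat, a lossless compression framework that encodes floating-point parameters with entropy coding.

First, a Huffman tree is built from the distribution of exponents across all BFloat16 weights in the LLM's linear projection matrices.

Next, Huffman coding compresses the exponents, while the original sign bits and mantissas are kept as they are.

The encoded exponents are tightly packed into an EncodedExponent byte array. The sign bits and mantissas stay uncompressed and are stored in a separate PackedSignMantissa byte array.

[Figure: the dynamic-length float format compactly represents floating-point model parameters]

Efficient decoding with compact LUTs

Huffman codes can be decoded efficiently with a lookup table (LUT), so the team builds a LUT of size 2^L, where L is the maximum bit length of any Huffman code in the codebook.

To decode, the next L bits are read from the encoded bitstream and used as an index into the LUT to obtain the next decoded symbol.

For decoding exponents in the DFloat11 format, the maximum code length L of each model is limited to 32 bits.

For models where L exceeds 32, the team enforces the length constraint by lowering the frequencies of the rarest exponents to 1 and rebuilding the Huffman tree.

This yields a more balanced structure at the tail of the Huffman tree, assigns codes of equal length to the rarest exponents, and reduces the maximum code length to 32 bits.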
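To make the entropy argument and the Huffman step above concrete, here is a minimal Python sketch. It is not the authors' code: the helper names (`bf16_bits`, `split_bf16`, `huffman_code_lengths`) are hypothetical, the weights are synthetic Gaussian samples rather than real LLM weights, and BFloat16 is obtained by simply truncating a float32. It only illustrates how exponent entropy and the resulting effective bit width could be estimated.

```python
import heapq
import math
import random
import struct
from collections import Counter


def bf16_bits(x):
    """Top 16 bits of the float32 encoding of x (BFloat16 by truncation)."""
    return struct.unpack(">I", struct.pack(">f", x))[0] >> 16


def split_bf16(bits):
    """Return the (sign, exponent, mantissa) fields of a 16-bit BFloat16 pattern."""
    return bits >> 15, (bits >> 7) & 0xFF, bits & 0x7F


def shannon_entropy(symbols):
    """H(X) = -sum p(x) log2 p(x), estimated from empirical frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def huffman_code_lengths(freqs):
    """Map each symbol to the length in bits of its Huffman code."""
    if len(freqs) == 1:
        return {next(iter(freqs)): 1}
    # Heap items: (weight, unique tie breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


random.seed(0)
# Assumption: synthetic stand-in for one weight matrix, not real model weights.
weights = [random.gauss(0.0, 0.02) for _ in range(20000)]
exponents = [split_bf16(bf16_bits(w))[1] for w in weights]

freqs = Counter(exponents)
lengths = huffman_code_lengths(freqs)
exp_entropy = shannon_entropy(exponents)
avg_exp_bits = sum(freqs[s] * lengths[s] for s in freqs) / len(exponents)
effective_bits = 1 + 7 + avg_exp_bits  # sign and mantissa stay uncompressed

print(f"exponent entropy:            {exp_entropy:.2f} bits")
print(f"avg Huffman exponent length: {avg_exp_bits:.2f} bits")
print(f"effective bits per weight:   {effective_bits:.2f}")
```

The average Huffman code length always lies within one bit of the entropy, which is why an exponent entropy of roughly 2.6 bits, plus the 8 uncompressed sign and mantissa bits, lands near 11 bits per weight.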
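When L = 32, however, a direct lookup table would need 2^32 ≈ 4.29 billion entries, which would consume an enormous amount of memory.

To solve this, the team decomposes the huge LUT into four disjoint, memory-efficient lookup tables: LUT1, LUT2, LUT3, and LUT4.

The tables then fit entirely in GPU SRAM, which allows fast access.

Two-phase kernel and lightweight auxiliary variables

To decode the entropy-coded exponents of the DFloat11 format in a massively parallel way, the team assigns each thread a fixed-length run of bytes from the encoded sequence.

This approach raises two main challenges:

1. Huffman codes have variable bit widths and the encoded data is tightly packed, so the bit position at which each thread should start decoding is not known.
2. Except for the first thread, the indices of the elements a thread will decode are unknown, which makes it hard to determine the correct output positions for the decoded results.

To solve the first problem, the team uses a gap array to determine each thread's starting bit position.

The gap array, Gaps, holds one entry per thread. Each entry gives the bit offset of the first valid Huffman code relative to the starting byte assigned to that thread. Because the maximum code length is 32 bits, every offset lies in the range [0, 31]. For memory efficiency, each entry is encoded in 5 bits.

For the second problem, the most direct approach is to keep an array that stores the output position of the first element decoded by each thread. That, however, incurs a large storage overhead.

To reduce this overhead, the team stores only the output position of the first element in each thread block, rather than one position per thread.

To decode with only block-level output positions, the team adopts a two-phase kernel design.

In the first phase, all threads in a thread block decode their assigned portions of the encoded sequence in parallel, but write no output to global memory. Instead, each thread counts the number of elements it will decode.

After this step, all threads in the block are synchronized, and a prefix sum determines each thread's output position. The prefix sum starts from the known output position of the thread block.

In the second phase, each thread decodes the same portion of the encoded sequence again, this time writing the decoded results to the correct output positions in HBM.

To avoid accessing HBM twice across the two phases, the encoded exponent data is loaded into SRAM.

[Figure: pseudocode of the two-phase kernel]

Transformer block-level decompression

At this point there is a complete method for massively parallel decompression of entropy-coded exponents.

LLM weights are stored in the DFloat11 format together with lightweight auxiliary data: thread-level gap offsets and block-level output positions, which determine where each thread reads and writes.

During inference, the compressed weights and these auxiliary variables reside entirely on the GPU.

When a weight matrix is needed for a matrix multiplication, it is decompressed on the fly into the original BFloat16 format. As soon as the multiplication finishes, the BFloat16 matrix is discarded to save GPU memory.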
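The gap array and the two-phase scheme described above can be simulated on the CPU in a few lines of Python. This is a simplified sketch under stated assumptions, not the authors' kernel: the codebook is a made-up four-symbol prefix code, "threads" run serially, each thread owns 8 bits instead of n = 8 bytes, a block has 4 threads instead of T = 256, codes are matched with a dictionary instead of the compact LUTs, and a gap equal to `CHUNK_BITS` is used as a sentinel for a trailing chunk in which no code starts.

```python
from itertools import accumulate

# Hypothetical prefix-free codebook: exponent value -> Huffman bit string.
CODEBOOK = {126: "0", 125: "10", 127: "110", 124: "111"}
DECODE = {code: sym for sym, code in CODEBOOK.items()}
CHUNK_BITS = 8          # bits owned by one thread (assumption for the sketch)
THREADS_PER_BLOCK = 4   # threads per block (assumption for the sketch)


def encode(symbols):
    """Return (bits, gaps, block_out) for a list of symbols."""
    bits, starts = "", []
    for s in symbols:
        starts.append(len(bits))
        bits += CODEBOOK[s]
    n_threads = -(-len(bits) // CHUNK_BITS)            # ceiling division
    n_blocks = -(-n_threads // THREADS_PER_BLOCK)
    gaps = [CHUNK_BITS] * n_threads                    # sentinel: no code starts here
    block_out = [len(symbols)] * n_blocks
    for idx, p in enumerate(starts):
        t = p // CHUNK_BITS                            # thread owning this code
        gaps[t] = min(gaps[t], p % CHUNK_BITS)         # offset of its first code
        b = t // THREADS_PER_BLOCK
        block_out[b] = min(block_out[b], idx)          # first output index of block
    return bits, gaps, block_out


def decode_thread(bits, gaps, t):
    """Yield the symbols whose codes start inside thread t's chunk."""
    pos = t * CHUNK_BITS + gaps[t]
    end = min((t + 1) * CHUNK_BITS, len(bits))
    while pos < end:
        length = 1
        while bits[pos:pos + length] not in DECODE:    # a code may cross the chunk end
            length += 1
        yield DECODE[bits[pos:pos + length]]
        pos += length


def decode_block(bits, gaps, block_out, b, out):
    threads = range(b * THREADS_PER_BLOCK,
                    min((b + 1) * THREADS_PER_BLOCK, len(gaps)))
    # Phase 1: every thread only counts the symbols it will produce.
    counts = [sum(1 for _ in decode_thread(bits, gaps, t)) for t in threads]
    # After the barrier: exclusive prefix sum seeded with the block's known offset.
    offsets = list(accumulate(counts[:-1], initial=block_out[b]))
    # Phase 2: decode again, now writing to the computed output positions.
    for t, start in zip(threads, offsets):
        for i, sym in enumerate(decode_thread(bits, gaps, t)):
            out[start + i] = sym


symbols = [126, 125, 126, 127, 124, 126, 126, 125, 127, 126] * 5
bits, gaps, block_out = encode(symbols)
decoded = [None] * len(symbols)
for b in range(len(block_out)):
    decode_block(bits, gaps, block_out, b, decoded)
print(decoded == symbols)
```

Decoding each chunk twice looks wasteful, but it is what lets the scheme store one output position per block instead of one per thread; on the GPU the second pass reads the encoded bytes from SRAM rather than HBM.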
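In practice, a single weight matrix is usually fairly small, so decompressing one matrix on its own often cannot make full use of the GPU.

In the DFloat11 decompression kernel, the number of bytes processed per thread is set to n = 8, the number of threads per block to T = 256, and the number of thread blocks to B = ⌈|EncodedExponent| / (nT)⌉, where |EncodedExponent| is the total number of bytes occupied by the encoded exponents.

As the size of the DFloat11 weights grows, more thread blocks are put to work, which raises overall decompression throughput.

Figure 6 illustrates this effect: decompression throughput rises markedly as the matrix size increases. To take advantage of it, the team recommends batching the decompression of multiple matrices to increase throughput and hide latency.

More concretely, the decompression of all DFloat11 weight matrices within a single Transformer block can be batched.

Before any computation in a Transformer block is executed, all of its associated weights are decompressed first. This significantly reduces decompression latency and improves overall inference efficiency.

Figure 5 shows the latency breakdown of Llama-3.1-8B-Instruct compressed with DFloat11 at different batch sizes.

Results

DF11 compresses LLMs to 70% of their size

Table 2 lists the compression ratios DF11 achieves on a range of recent LLMs.

The compressed models include LLaMA3/3.1/3.3, Qwen2.5, QwQ, Mistral Nemo/Small/Codestral, Gemma2/3, and DeepSeek-R1-Distilled.

The results show that DF11 compresses every model to about 70% of its original size, which corresponds to an effective bit width of roughly 11 bits.

DF11 compression is fully lossless

The team verified the lossless property of DF11 compression on a series of standard benchmarks.

The evaluation uses lm_evaluation_harness and reports accuracy on MMLU and TruthfulQA as well as word-level perplexity on WikiText and C4.

As Table 3 shows, the accuracy and perplexity of the compressed models match the original BF16 models exactly.

To verify losslessness further, the team compared the BF16 weight matrices decompressed from DF11 with the original weight matrices of each model in Table 2 and confirmed that they are identical at the bit level.

DF11 beats CPU offloading in inference efficiency

The team compared the inference efficiency of DF11 and BF16 models on different hardware platforms.

The uncompressed BF16 models exceed the memory of a single GPU in these setups, while the losslessly compressed DF11 models do not.

For the BF16 models, most of the model and its computation stays on the GPU, while some components and their associated computation are offloaded to the CPU.

As Figure 3 shows, the DF11 models consistently outperform the BF16 models that use CPU offloading, with 1.85x to 38.83x lower latency or higher throughput.

DF11 supports longer generation lengths

The memory saved by DF11 compression not only reduces the number of GPUs needed for inference but also allows longer generations.

During inference, the KV cache grows linearly with the number of decoded tokens and quickly becomes the GPU memory bottleneck.

Figure 4 shows the GPU memory consumption of DF11 and BF16 models as the number of decoded tokens grows, at a batch size of 1.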
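As the figure shows, DF11 compression significantly extends the token generation length: compared with the BF16 models, 5.33x to 13.17x as many tokens can be decoded before the GPU memory limit is reached.

Conclusion

In this work, the researchers propose Dynamic-Length Float (DFloat), a lossless compressed data format for LLM weights. They present DFloat as currently the only data format that both reduces memory footprint and remains compatible with efficient GPU inference.

Specifically, they evaluate several popular LLMs with the 11-bit DFloat format (DF11) and develop a custom GPU kernel for it.

The experiments show that DF11-based compression significantly lowers the hardware requirements for serving LLMs, and that the extra computational overhead it adds is acceptable in most practical scenarios.

References:

https://arxiv.org/abs/2504.11651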