{"id":8379,"date":"2025-02-18T15:32:20","date_gmt":"2025-02-18T13:32:20","guid":{"rendered":"https:\/\/www.tonmeister.ca\/wordpress\/?p=8379"},"modified":"2025-02-19T15:09:09","modified_gmt":"2025-02-19T13:09:09","slug":"bit-depth-conversion-part-4","status":"publish","type":"post","link":"https:\/\/www.tonmeister.ca\/wordpress\/2025\/02\/18\/bit-depth-conversion-part-4\/","title":{"rendered":"Bit depth conversion: Part 4"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Converting floating point to fixed point<\/h2>\n\n\n\n<p>It is often the case that you have to convert a floating point representation to a fixed point representation. For example, you&#8217;re doing some signal processing like changing the volume or adding equalisation, and you want to output the signal to a DAC or a digital output.<\/p>\n\n\n\n<p>The easiest way to do this is to just send the floating point signal into the DAC or the S\/PDIF transmitter and let it look after things. However, in my experience, you can&#8217;t always trust this. (I&#8217;ll explain why in a later posting in this series.) So, if you&#8217;re a geek like me, then you do this conversion yourself in advance to ensure you&#8217;re getting what you think you&#8217;re getting.<\/p>\n\n\n\n<p>To start, we&#8217;ll assume that, in the floating point world, you have ensured that your signal is scaled in level to have a maximum amplitude of \u00b11.0. In floating point, it&#8217;s possible to go much higher than this, and there&#8217;s no serious reason to worry about going much lower (see <a href=\"https:\/\/www.tonmeister.ca\/wordpress\/2021\/07\/19\/fixed-point-vs-floating-point\/\" data-type=\"post\" data-id=\"6865\">this posting<\/a>). 
However, we work with the assumption that we&#8217;re around that level.<\/p>\n\n\n\n<p>So, if you have a 0 dB FS sine wave in floating point, then its maximum and minimum will hit \u00b11.0.<\/p>\n\n\n\n<p>Then, we have to convert that signal with a range of \u00b11.0 to a fixed point system that, as we already know, is asymmetrical. This means that we have to be a little careful about how we scale the signal to avoid clipping on the positive side. We do this by multiplying the \u00b11.0 signal by 2^(nBits-1)-1 if the signal is not dithered. (Pay heed to that &#8220;-1&#8221; at the end of the multiplier.)<\/p>\n\n\n\n<p>Let&#8217;s do an example of this, using a 5-bit output to keep things on a human scale. We take the floating point values and multiply each of them by 2^(5-1)-1 (or 15). We then round the signals to the nearest integer value and save this as a two&#8217;s complement binary value. This is shown below in Figure 1.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"589\" height=\"1024\" src=\"https:\/\/www.tonmeister.ca\/wordpress\/wp-content\/uploads\/bit_depth_pt04_fig01-589x1024.png\" alt=\"\" class=\"wp-image-8380\" srcset=\"https:\/\/www.tonmeister.ca\/wordpress\/wp-content\/uploads\/bit_depth_pt04_fig01-589x1024.png 589w, https:\/\/www.tonmeister.ca\/wordpress\/wp-content\/uploads\/bit_depth_pt04_fig01-172x300.png 172w, https:\/\/www.tonmeister.ca\/wordpress\/wp-content\/uploads\/bit_depth_pt04_fig01-768x1336.png 768w, https:\/\/www.tonmeister.ca\/wordpress\/wp-content\/uploads\/bit_depth_pt04_fig01-883x1536.png 883w, https:\/\/www.tonmeister.ca\/wordpress\/wp-content\/uploads\/bit_depth_pt04_fig01.png 938w\" sizes=\"auto, (max-width: 589px) 100vw, 589px\" \/><figcaption class=\"wp-element-caption\">Figure 1. 
Converting floating point to a 5-bit fixed point value without dither.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>As should be obvious from Figure 1, we will never hit the bottom-most fixed point quantisation level (unless the signal is asymmetrical and actually goes a little below -1.0).<\/p>\n\n\n\n<p>If you choose to dither your audio signal, then you&#8217;re adding a white noise signal with an amplitude of \u00b11 quantisation level after the floating point signal is scaled and before it&#8217;s rounded. This means that you need one extra quantisation level of headroom to avoid clipping as a result of having added the dither. Therefore, you have to multiply the floating point value by 2^(nBits-1)-2 instead (notice the &#8220;-2&#8221; at the end there&#8230;) This is shown below in Figure 2.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"614\" height=\"1024\" src=\"http:\/\/www.tonmeister.ca\/wordpress\/wp-content\/uploads\/bit_depth_pt04_fig02-614x1024.png\" alt=\"\" class=\"wp-image-8384\" srcset=\"https:\/\/www.tonmeister.ca\/wordpress\/wp-content\/uploads\/bit_depth_pt04_fig02-614x1024.png 614w, https:\/\/www.tonmeister.ca\/wordpress\/wp-content\/uploads\/bit_depth_pt04_fig02-180x300.png 180w, https:\/\/www.tonmeister.ca\/wordpress\/wp-content\/uploads\/bit_depth_pt04_fig02.png 623w\" sizes=\"auto, (max-width: 614px) 100vw, 614px\" \/><figcaption class=\"wp-element-caption\">Figure 2. Converting floating point to a 5-bit fixed point value with dither.<\/figcaption><\/figure>\n<\/div>\n\n\n<p>Of course, you can choose to not dither the signal. Dither was a really useful thing back in the days when we only had 16 reliable bits to work with. 
However, now that 24-bit signals are normal, dither is not really a concern.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Converting floating point to fixed point It is often the case that you have to convert a floating point representation to a fixed point representation. For example, you&#8217;re doing some signal processing like changing the volume or adding equalisation, and you want to output the signal to a DAC or a digital output. The easiest [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[63,4,59,43],"tags":[],"class_list":["post-8379","post","type-post","status-publish","format-standard","hentry","category-analysis","category-audio","category-digital-audio","category-dsp"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p48hIM-2b9","_links":{"self":[{"href":"https:\/\/www.tonmeister.ca\/wordpress\/wp-json\/wp\/v2\/posts\/8379","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tonmeister.ca\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tonmeister.ca\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tonmeister.ca\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tonmeister.ca\/wordpress\/w
p-json\/wp\/v2\/comments?post=8379"}],"version-history":[{"count":2,"href":"https:\/\/www.tonmeister.ca\/wordpress\/wp-json\/wp\/v2\/posts\/8379\/revisions"}],"predecessor-version":[{"id":8385,"href":"https:\/\/www.tonmeister.ca\/wordpress\/wp-json\/wp\/v2\/posts\/8379\/revisions\/8385"}],"wp:attachment":[{"href":"https:\/\/www.tonmeister.ca\/wordpress\/wp-json\/wp\/v2\/media?parent=8379"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tonmeister.ca\/wordpress\/wp-json\/wp\/v2\/categories?post=8379"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tonmeister.ca\/wordpress\/wp-json\/wp\/v2\/tags?post=8379"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}