• Download source files - 18.7 Kb
  • Download demo project - 185 Kb

Introduction

Have you ever tried to write something that records sound from the sound card and encodes it in MP3 format? Not interesting? Well, to make things more interesting, have you ever tried to write an MP3 streaming internet radio server? I know, you'll say "What for? There are good and pretty much standard implementations like Icecast or SHOUTcast". But have you ever tried, at least, to dig a bit into this kitchen, or to write something similar for your own pleasure? That's what this article is about. Of course, we can't cover every topic in one article; that would get tiresome. So I will split the topic across a few articles, and this one covers the recording and encoding process.

Background

Obviously, the first problem everyone encounters is the MP3 encoding itself. Writing an encoder that works properly isn't an easy task. So, I won't go too far and will stop at the LAME encoder (hosted on SourceForge), considered one of the best (one of, not the only!). I am using version 3.97; those interested in the sources can download them from SourceForge (it's an open source project). The relevant "lame_enc.dll" is also included in the demo project (see the links at the top of this article).

The next problem is recording the sound from the soundcard. Well, with some luck, on Google, MSDN, and CodeProject, you can find many articles related to this topic. I should say that I am using the low level waveform-audio API (see the Windows Media Platform SDK, e.g., waveInOpen(...), mixerOpen(...), etc.).

So, let's go with the details now.

MP3 Encoding

Download the "mp3_stream_src.zip" file containing the sources (see the link to the sources at the top of this article). After un-zipping it, you should find the "mp3_simple.h" file in the INCLUDE folder. It contains the definition and implementation of the CMP3Simple class, a wrapper around the LAME API that I designed to make life a bit easier. I commented the code as much as possible, and I hope those comments are good enough. All we need to know at this point:

  1. When instantiating a CMP3Simple object, we need to define the bitrate at which to encode the sound samples, the expected sample rate of the input, and (if re-sampling is needed) the desired sample rate of the encoded sound:

    // Constructor of the class accepts only three parameters.
    // Feel free to add more constructors with different parameters,
    // if a better customization is necessary.
    //
    // nBitRate - the bitrate at which to encode the raw (PCM) sound
    // (e.g. 16, 32, 40, 48, ... 64, ... 96, ... 128, etc), see the
    // official LAME documentation for accepted values.
    //
    // nInputSampleRate - expected input frequency of the raw (PCM) sound
    // (e.g. 44100, 32000, 22050, etc), see the official LAME documentation
    // for accepted values.
    //
    // nOutSampleRate - requested frequency for the encoded/output
    // (MP3) sound. If zero, the sound is not
    // re-sampled (nOutSampleRate = nInputSampleRate).
    CMP3Simple(unsigned int nBitRate, unsigned int nInputSampleRate = 44100,
               unsigned int nOutSampleRate = 0);
    

  2. Encoding itself is performed via CMP3Simple::Encode(...):

    // This method performs the encoding.
    //
    // pSamples - pointer to the buffer containing the raw (PCM) sound to be
    // encoded. Mind that the buffer must be an array of SHORT (16-bit PCM
    // stereo sound; for mono 8-bit PCM sound, it is better to double every
    // byte to obtain 16 bits).
    //
    // nSamples - number of elements (SHORT) in "pSamples". Not to be confused
    // with the buffer size, which (usually) is a volume in bytes. See
    // also the "MaxInBufferSize" method.
    //
    // pOutput - pointer to the buffer that will receive the encoded (MP3)
    // sound; here we have bytes already. LAME says that if pOutput is not
    // cleaned before the call, data in pOutput will be mixed with incoming
    // data from pSamples.
    //
    // pdwOutput - pointer to a variable that will receive the
    // number of bytes written to "pOutput". See also the "MaxOutBufferSize"
    // method.
    BE_ERR Encode(PSHORT pSamples, DWORD nSamples, PBYTE pOutput,
                  PDWORD pdwOutput);
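
The remark about 8-bit mono input and the difference between a sample count and a size in bytes can be illustrated with a small, portable sketch. It is only one possible reading of "double every byte": each unsigned 8-bit sample is widened to a signed 16-bit value and written to both the left and the right channel. The helper names Mono8ToStereo16 and SampleCountFromBytes are hypothetical and are not part of "mp3_simple.h".

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of mp3_simple.h): widens unsigned 8-bit mono
// PCM to signed 16-bit interleaved stereo PCM.
std::vector<std::int16_t> Mono8ToStereo16(const std::uint8_t* src,
                                          std::size_t count) {
    std::vector<std::int16_t> out;
    out.reserve(count * 2);
    for (std::size_t i = 0; i < count; ++i) {
        // 8-bit PCM is unsigned with silence at 128; 16-bit PCM is signed.
        const int centered = static_cast<int>(src[i]) - 128;
        const std::int16_t s = static_cast<std::int16_t>(centered * 256);
        out.push_back(s);  // left channel
        out.push_back(s);  // right channel
    }
    return out;
}

// Number of 16-bit elements in a buffer whose size is given in bytes.
// This is the kind of value "nSamples" expects, as opposed to the byte count.
std::size_t SampleCountFromBytes(std::size_t bytes) {
    return bytes / sizeof(std::int16_t);
}
```

With such a helper, the resulting vector's data() and size() would be the natural candidates for "pSamples" and "nSamples", assuming the encoder was created for 16-bit stereo input.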
    

Recording from the soundcard

Similarly, after un-zipping the "mp3_stream_src.zip" file, you should find the "waveIN_simple.h" file inside the INCLUDE folder. It contains the definitions and implementations of the CWaveINSimple, CMixer and CMixerLine classes. Those classes are wrappers for a sub-set of the waveform-audio API functions. Why just a sub-set? Because (I am lazy sometimes) they encapsulate only the functionality associated with Wave In devices (recording). Wave Out devices (playback) are not covered (type "sndvol32 /r" in "Start->Run" to see what I mean). Check the comments I added to each class to get a better picture of what they are doing. What we need to know at this point:

  1. One CWaveINSimple device has one CMixer which has zero or more CMixerLines.
  2. Constructors and destructors of all those classes are declared "private" (by design).
    • Objects of the CWaveINSimple class cannot be instantiated directly; use the CWaveINSimple::GetDevices() and CWaveINSimple::GetDevice(...) static methods instead.
    • Objects of the CMixer class cannot be instantiated directly; use the CWaveINSimple::OpenMixer() method instead.
    • Objects of the CMixerLine class cannot be instantiated directly; use the CMixer::GetLines() and CMixer::GetLine(...) methods instead.
  3. In order to capture and further process sound data, a class must inherit from the IReceiver abstract class and implement the IReceiver::ReceiveBuffer(...) method. An instance of the IReceiver-derived class is then passed to CWaveINSimple via CWaveINSimple::Start(IReceiver *pReceiver).

    // See CWaveINSimple::Start(IReceiver *pReceiver) below.
    // Instances of any class extending "IReceiver" will be able
    // to receive raw (PCM) sound from an instance of CWaveINSimple
    // and process it via their own implementation of the "ReceiveBuffer" method.
    class IReceiver {
    public:
        virtual void ReceiveBuffer(LPSTR lpData, DWORD dwBytesRecorded) = 0;
    };
    ...
    class CWaveINSimple {
    private:
        ...
        // This method starts recording sound from the
        // WaveIN device. The passed object (derived from
        // IReceiver) will be responsible for further
        // processing of the sound data.
        void _Start(IReceiver *pReceiver);
        ...
    public:
        ...
        // Wrapper of the _Start() method, for the multithreading
        // version. This is the actual starter.
        void Start(IReceiver *pReceiver);
        ...
    };
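
To show what an IReceiver implementation might look like independently of the MP3 encoder, here is a minimal sketch of a receiver that only measures the incoming sound. It is not taken from "waveIN_simple.h": the `sketch` namespace, the stand-in typedefs, the LevelMeter class and the added virtual destructor are all assumptions made so that the example compiles without <windows.h>. It also assumes the buffers carry 16-bit stereo PCM at 44100 Hz.

```cpp
#include <cstring>

namespace sketch {

// Stand-ins for the Windows typedefs, so the sketch builds without <windows.h>.
typedef char*         LPSTR;
typedef unsigned long DWORD;

// Same shape as the IReceiver interface shown above (plus a virtual destructor).
class IReceiver {
public:
    virtual ~IReceiver() {}
    virtual void ReceiveBuffer(LPSTR lpData, DWORD dwBytesRecorded) = 0;
};

// Hypothetical receiver: counts the bytes delivered and tracks the peak level.
class LevelMeter : public IReceiver {
public:
    LevelMeter() : m_totalBytes(0), m_peak(0) {}

    virtual void ReceiveBuffer(LPSTR lpData, DWORD dwBytesRecorded) {
        const DWORD count = dwBytesRecorded / sizeof(short);
        for (DWORD i = 0; i < count; ++i) {
            short s;
            // memcpy avoids alignment and aliasing problems with the byte buffer.
            std::memcpy(&s, lpData + i * sizeof(short), sizeof(short));
            const int magnitude = (s < 0) ? -static_cast<int>(s) : s;
            if (magnitude > m_peak) m_peak = magnitude;
        }
        m_totalBytes += dwBytesRecorded;
    }

    unsigned long long TotalBytes() const { return m_totalBytes; }
    int Peak() const { return m_peak; }

    // Recorded duration, assuming 44100 Hz, 2 channels, 2 bytes per sample.
    double Seconds() const { return m_totalBytes / (44100.0 * 2 * 2); }

private:
    unsigned long long m_totalBytes;
    int m_peak;
};

} // namespace sketch
```

A receiver like this could be handy for checking that a line delivers any signal at all before bothering with encoding.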
    

    Let's see some examples.

    Examples

    1. How would we list all the Wave In devices in the system?

      const vector<CWaveINSimple*>& wInDevices = CWaveINSimple::GetDevices();
      UINT i;
      for (i = 0; i < wInDevices.size(); i++) {
          printf("%s\n", wInDevices[i]->GetName());
      }

    2. How would we list a Wave In device's lines (supposing that strDeviceName = e.g., "SoundMAX Digital Audio")?

      CWaveINSimple& WaveInDevice = CWaveINSimple::GetDevice(strDeviceName);
      CHAR szName[MIXER_LONG_NAME_CHARS];
      UINT j;
      try {
          CMixer& mixer = WaveInDevice.OpenMixer();
          const vector<CMixerLine*>& mLines = mixer.GetLines();
          for (j = 0; j < mLines.size(); j++) {
              // Useful when the line's name contains non-English characters
              ::CharToOem(mLines[j]->GetName(), szName);
              printf("%s\n", szName);
          }
          mixer.Close();
      }
      catch (const char *err) {
          printf("%s\n", err);
      }

    3. How would we record and encode in MP3 actually?

      First of all, we define a class like:

      class mp3Writer: public IReceiver {
      private:
          CMP3Simple m_mp3Enc;
          FILE *f;
      public:
          mp3Writer(unsigned int bitrate = 128,
                    unsigned int finalSampleRate = 0):
              m_mp3Enc(bitrate, 44100, finalSampleRate) {
              f = fopen("music.mp3", "wb");
              if (f == NULL) throw "Can't create MP3 file.";
          }
          ~mp3Writer() {
              fclose(f);
          }
          // Called by CWaveINSimple for every recorded buffer of raw PCM.
          virtual void ReceiveBuffer(LPSTR lpData, DWORD dwBytesRecorded) {
              BYTE  mp3Out[44100 * 4];
              DWORD dwOut = 0;
              // Encode() expects a count of SHORTs, hence bytes / 2.
              m_mp3Enc.Encode((PSHORT) lpData, dwBytesRecorded / 2,
                              mp3Out, &dwOut);
              if (dwOut > 0) fwrite(mp3Out, dwOut, 1, f);
          }
      };
      

      and (supposing that strLineName = e.g., "Microphone"):

      try {
          CWaveINSimple& device = CWaveINSimple::GetDevice(strDeviceName);
          CMixer& mixer = device.OpenMixer();
          CMixerLine& mixerline = mixer.GetLine(strLineName);
          mixerline.UnMute();
          mixerline.SetVolume(0);
          mixerline.Select();
          mixer.Close();
          mp3Writer *mp3Wr = new mp3Writer();
          device.Start((IReceiver *) mp3Wr);
          while( !_kbhit() ) ::Sleep(100);
          device.Stop();
          delete mp3Wr;
      }
      catch (const char *err) {
          printf("%s\n", err);
      }
      CWaveINSimple::CleanUp();

    Remark 1

    mixerline.SetVolume(0) is a pretty tricky point. For some sound cards, SetVolume(0) gives the original (good) sound quality; for others, SetVolume(100) does the same. However, you can find sound cards where SetVolume(15) gives the best quality. I have no good advice here; just try and check.

    Remark 2

    Almost every sound card supports a "Wave Out Mix" or "Stereo Mix" mixer line (the name varies, and the list can be extended). Recording from such a line (mixerline.Select()) will actually record everything going to the sound card's Wave Out (read "speakers"). So, leave WinAmp or Windows Media Player playing for a while and start the application to record the sound at the same time; you'll see the result.

    Remark 3

    Rather than calling:

    mp3Writer *mp3Wr = new mp3Writer();

    it is also possible to create the mp3Writer instance as follows (see the class definition above):

    mp3Writer *mp3Wr = new mp3Writer(64, 32000);

    This will produce a final MP3 at a 64 kbps bitrate and a 32 kHz sample rate.
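
To get a feeling for what a given bitrate means for the size of "music.mp3", the arithmetic can be sketched as follows. This is a rough estimate only: it assumes constant-bitrate output (the article does not state how CMP3Simple configures LAME) and ignores any headers or tags. The function names are hypothetical.

```cpp
#include <cstdint>

// Approximate size of a constant-bitrate MP3 stream.
// 1 kbps = 1000 bits per second, 8 bits per byte.
std::uint64_t Mp3BytesForDuration(unsigned kbps, unsigned seconds) {
    return static_cast<std::uint64_t>(kbps) * 1000 / 8 * seconds;
}

// Size of the raw PCM input over the same period, for comparison.
std::uint64_t PcmBytesForDuration(unsigned sampleRate, unsigned channels,
                                  unsigned bitsPerSample, unsigned seconds) {
    return static_cast<std::uint64_t>(sampleRate) * channels *
           (bitsPerSample / 8) * seconds;
}
```

Under these assumptions, one minute at 64 kbps is about 480,000 bytes, while one minute of 44100 Hz 16-bit stereo PCM input is 10,584,000 bytes.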

    Comments on using the demo application

    The demo application (see the links at the top of this article) is a console application supporting a few command line options. Executing the application without any command line options will simply print the usage guideline, e.g.:

    ...>mp3_stream.exe
    mp3_stream.exe -devices
    Will list WaveIN devices.
    mp3_stream.exe -device=<device_name>
    Will list recording lines of the WaveIN <device_name> device.
    mp3_stream.exe -device=<device_name> -line=<line_name>
    [-v=<volume>] [-br=<bitrate>] [-sr=<samplerate>]
    Will record from the <line_name>
    at the given voice <volume>, output <bitrate> (in Kbps)
    and output <samplerate> (in Hz).
    <volume>, <bitrate> and <samplerate> are optional parameters.
    <volume> - integer value between (0..100), defaults to 0 if not set.
    <bitrate> - integer value (16, 24, 32, .., 64, etc.),
    defaults to 128 if not set.
    <samplerate> - integer value (44100, 32000, 22050, etc.),
    defaults to 44100 if not set.

    Executing the application with the "-devices" command line option will print the names of the Wave In devices currently installed in the system, e.g.:

    ...>mp3_stream.exe -devices
    Realtek AC97 Audio
    

    Executing the application with the "-device=<device_name>" command line option will list all the lines of the selected Wave In device, e.g.:

    ...>mp3_stream.exe "-device=Realtek AC97 Audio"
    Mono Mix
    Stereo Mix
    Aux
    TV Tuner Audio
    CD Player
    Line In
    Microphone
    Phone Line
    

    Finally, the application will start recording (and encoding) sound from the selected Wave In device/line (the microphone in this example) when executed with the following command line options:

    ...>mp3_stream.exe "-device=Realtek AC97 Audio" -line=Microphone
    Recording at 128Kbps, 44100Hz
    from Microphone (Realtek AC97 Audio).
    Volume 0%.
    hit <ENTER> to stop ...
    

    Recorded and encoded sound is saved in the "music.mp3" file, in the same folder from where you executed the application.

    If you want to record sound that is currently playing (e.g., AVI movie, or Video DVD, or ...) through the soundcard Wave Out, you can run the application with the following options:

    ...>mp3_stream.exe "-device=Realtek AC97 Audio" "-line=Stereo Mix"
    

    However, this may be specific to my configuration only (see also "Remark 2" above).

    You can specify additional command line parameters, e.g.:

    ...>mp3_stream.exe "-device=Realtek AC97 Audio"
    "-line=Stereo Mix" -v=100 -br=32 -sr=32000

    This will set the line's volume to 100%, and will produce the final MP3 at 32 kbps and 32 kHz.
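
For readers who want to build a similar front end, the option syntax described in this section could be parsed along the following lines. This is a sketch, not the demo application's actual parser: the Options struct and the TakeValue and ParseOptions functions are hypothetical names, and only the option spellings and defaults (volume 0, bitrate 128, sample rate 44100) come from the usage text above.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical container for the parsed command line.
struct Options {
    bool        listDevices = false;
    std::string device;
    std::string line;
    int         volume = 0;        // 0..100
    int         bitrate = 128;     // kbps
    int         sampleRate = 44100; // Hz
};

// If "arg" starts with "prefix", stores the remainder in "value".
bool TakeValue(const std::string& arg, const std::string& prefix,
               std::string& value) {
    if (arg.compare(0, prefix.size(), prefix) != 0) return false;
    value = arg.substr(prefix.size());
    return true;
}

// Returns false on an unknown option or an out-of-range volume.
bool ParseOptions(const std::vector<std::string>& args, Options& opt) {
    for (const std::string& arg : args) {
        std::string v;
        if (arg == "-devices")                 opt.listDevices = true;
        else if (TakeValue(arg, "-device=", v)) opt.device = v;
        else if (TakeValue(arg, "-line=", v))   opt.line = v;
        else if (TakeValue(arg, "-v=", v))      opt.volume = std::atoi(v.c_str());
        else if (TakeValue(arg, "-br=", v))     opt.bitrate = std::atoi(v.c_str());
        else if (TakeValue(arg, "-sr=", v))     opt.sampleRate = std::atoi(v.c_str());
        else return false;
    }
    return opt.volume >= 0 && opt.volume <= 100;
}
```

In a real main(), the vector would typically be built from argv + 1 up to argv + argc; quoting such as "-line=Stereo Mix" is already resolved by the shell before the program sees the argument.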

    Conclusion

    In this article, I covered the couple of months I spent investigating MP3 encoding APIs and recording (capturing, actually) the sound going to the sound card's speakers. I used all these techniques to implement an internet-based radio station (MP3 streaming server). I found this topic very interesting, and decided to share some of my code. In one of my next articles, I will try to cover some aspects of MP3 streaming and IO Completion Ports, but, until then, I have to clean up the existing code, comment it, and prepare the article :).

  • License

    This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

