Sound recording and encoding in MP3 format.

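This article presents a way to perform MP3 encoding with the LAME library, with sample code showing how to record audio from the sound card and convert it to MP3 format. It also discusses how to adjust parameters such as volume, bitrate, and sample rate.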

Introduction

Have you ever tried to write something that records sound from the sound card and encodes it in MP3 format? Not interesting? Well, to make things more interesting, have you ever tried to write an MP3 streaming internet radio server? I know, you'll say "What for? There are good and pretty much standard implementations like Icecast or SHOUTcast". But, anyway, have you ever tried, at least, to dig a bit inside this entire kitchen or write anything similar for your soul? Well, that's what this article is about. Of course, we won't manage to cover all topics in one article; that would be tiresome. So, I will split the entire topic into a few articles, this one covering the recording and encoding process.

Background

Obviously, the first problem everyone encounters is the MP3 encoding itself. Writing an encoder that works properly isn't an easy task. So, I won't go too far and will stop at the LAME (SourceForge) encoder, considered one of the best (one, not the only!). I am using version 3.97; those interested in the sources can download them from SourceForge (it's an open source project). The relevant "lame_enc.dll" is also included in the demo project (see the links at the top of this article).

The next problem is recording the sound from the soundcard. Well, with some luck, on Google, MSDN, and CodeProject, you can find many articles related to this topic. I should say that I am using the low level waveform-audio API (see the Windows Media Platform SDK, e.g., waveInOpen(...), mixerOpen(...), etc.).

So, let's go with the details now.

MP3 Encoding

Download the "mp3_stream_src.zip" file containing the sources (see the link to the sources at the top of this article). Inside it, you should find the "mp3_simple.h" file (see the INCLUDE folder after un-zipping). It contains the definition and implementation of the CMP3Simple class. This class is a wrapper around the LAME API, which I designed to make life a bit easier. I commented the code as much as possible, and I hope those comments are good enough. All we need to know at this point:

  1. When instantiating a CMP3Simple object, we need to define the bitrate at which to encode the sound samples, the expected frequency of the input samples, and (if re-sampling is necessary) the desired frequency of the encoded sound:
    // Constructor of the class accepts only three parameters.
    // Feel free to add more constructors with different parameters, 
    // if a better customization is necessary.
    //
    // nBitRate - says at what bitrate to encode the raw (PCM) sound
    // (e.g. 16, 32, 40, 48, ... 64, ... 96, ... 128, etc), see 
    // official LAME documentation for accepted values.
    // 
    // nInputSampleRate - expected input frequency of the raw (PCM) sound
    // (e.g. 44100, 32000, 22050, etc), see official LAME documentation
    // for accepted values.
    //
    // nOutSampleRate - requested frequency for the encoded/output 
    // (MP3) sound. If equal with zero, then sound is not
    // re-sampled (nOutSampleRate = nInputSampleRate).
    CMP3Simple(unsigned int nBitRate, unsigned int nInputSampleRate = 44100,
               unsigned int nOutSampleRate = 0);
    
  2. Encoding itself is performed via CMP3Simple::Encode(...).
    // This method performs encoding.
    //
    // pSamples - pointer to the buffer containing raw (PCM) sound to be 
    // encoded. Mind that buffer must be an array of SHORT (16 bits PCM stereo 
    // sound, for mono 8 bits PCM sound better to double every byte to obtain 
    // 16 bits).
    //
    // nSamples - number of elements in "pSamples" (SHORT). Not to be confused 
    // with buffer size which represents (usually) volume in bytes. See 
    // also "MaxInBufferSize" method.
    //
    // pOutput - pointer to the buffer that will receive encoded (MP3) sound, 
    // here we have bytes already. LAME says that if pOutput is not
    // cleaned before call, data in pOutput will be mixed with incoming 
    // data from pSamples.
    //
    // pdwOutput - pointer to a variable that will receive the 
    // number of bytes written to "pOutput". See also "MaxOutBufferSize" 
    // method.
    BE_ERR Encode(PSHORT pSamples, DWORD nSamples, PBYTE pOutput, 
                  PDWORD pdwOutput);
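
The difference between a sample count and a byte count is easy to get wrong, so here is a minimal sketch in portable C++. The helper names (SampleCount, Widen8To16) are hypothetical and not part of "mp3_simple.h". The widening shown is the conventional unsigned 8-bit to signed 16-bit PCM conversion, which is an assumption on my side and may differ from the byte-doubling that the comment above has in mind.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of 16-bit elements in a buffer of `bytes` bytes.
// This is the kind of value Encode() expects as nSamples, not the byte count.
inline std::size_t SampleCount(std::size_t bytes) {
    return bytes / sizeof(std::int16_t);
}

// Hypothetical helper: widen unsigned 8-bit PCM (silence = 128) to
// signed 16-bit PCM (silence = 0). Assumption: the input is unsigned,
// as 8-bit PCM usually is in WAV data.
inline std::vector<std::int16_t> Widen8To16(const std::vector<std::uint8_t>& in) {
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (std::uint8_t b : in) {
        // Shift the midpoint to zero, then scale to the 16-bit range.
        out.push_back(static_cast<std::int16_t>((static_cast<int>(b) - 128) * 256));
    }
    return out;
}
```

With these helpers, a 4096-byte buffer of 16-bit PCM would be passed to Encode(...) with nSamples = SampleCount(4096), that is, 2048.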
    

Recording from the soundcard

Similarly, after un-zipping the "mp3_stream_src.zip" file, inside the INCLUDE folder, you should find the "waveIN_simple.h" file. It contains the definitions and implementations of the CWaveINSimple, CMixer and CMixerLine classes. Those classes are wrappers for a sub-set of the waveform-audio API functions. Why just a sub-set? Because (I am lazy sometimes) they encapsulate only the functionality associated with Wave In devices (recording). So, Wave Out devices (playback) are not covered (type "sndvol32 /r" from "Start->Run" to see what I mean). Check the comments I added to each class to get a better picture of what they are doing. What we need to know at this point:

  1. One CWaveINSimple device has one CMixer which has zero or more CMixerLines.
  2. Constructors and destructors of all those classes are declared "private" (by design).
    • Objects of the CWaveINSimple class cannot be instantiated directly; the CWaveINSimple::GetDevices() and CWaveINSimple::GetDevice(...) static methods are provided for that.
    • Objects of the CMixer class cannot be instantiated directly; the CWaveINSimple::OpenMixer() method is provided for that.
    • Objects of the CMixerLine class cannot be instantiated directly; the CMixer::GetLines() and CMixer::GetLine(...) methods are provided for that.
  3. In order to capture and further process sound data, a class must inherit from the IReceiver abstract class and implement the IReceiver::ReceiveBuffer(...) method. An instance of the IReceiver derivative is then passed to CWaveINSimple via CWaveINSimple::Start(IReceiver *pReceiver):
    // See CWaveINSimple::Start(IReceiver *pReceiver) below.
    // Instances of any class extending "IReceiver" will be able 
    // to receive raw (PCM) sound from an instance of the CWaveINSimple 
    // and process sound via own implementation of the "ReceiveBuffer" method.
    class IReceiver {
    public:
        virtual void ReceiveBuffer(LPSTR lpData, DWORD dwBytesRecorded) = 0;
    };
    ...
    class CWaveINSimple {
    private:
    ...
        // This method starts recording sound from the 
        // WaveIN device. Passed object (derivate from 
        // IReceiver) will be responsible for further 
        // processing of the sound data.
        void _Start(IReceiver *pReceiver);
    ...
    public:
    ...
        // Wrapper of the _Start() method, for the multithreading
        // version. This is the actual starter.
        void Start(IReceiver *pReceiver);
    ...
    };
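
The receiver mechanism described above is a plain callback interface: the device owns the recording loop and hands every filled buffer to whatever object was passed to Start(...). The following sketch shows the idea without any Windows types. IByteReceiver, FakeSource and ByteCounter are hypothetical stand-ins invented for illustration; they are not part of "waveIN_simple.h". The sketch is also synchronous, whereas the comments above suggest the real Start(...) is the starter of a multithreaded version.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, portable stand-in for IReceiver.
class IByteReceiver {
public:
    virtual ~IByteReceiver() {}
    virtual void ReceiveBuffer(const char* data, std::size_t bytes) = 0;
};

// Hypothetical stand-in for a capture device: it "records" by slicing a
// prepared byte vector into fixed-size buffers and handing each one to
// the receiver, the way a device would hand over each filled buffer.
class FakeSource {
public:
    FakeSource(const std::vector<char>& pcm, std::size_t bufferBytes)
        : m_pcm(pcm), m_bufferBytes(bufferBytes) {}

    void Start(IByteReceiver* receiver) const {
        if (m_bufferBytes == 0) return;  // nothing sensible to deliver
        for (std::size_t pos = 0; pos < m_pcm.size(); pos += m_bufferBytes) {
            std::size_t n = m_pcm.size() - pos;
            if (n > m_bufferBytes) n = m_bufferBytes;  // last buffer may be short
            receiver->ReceiveBuffer(m_pcm.data() + pos, n);
        }
    }

private:
    std::vector<char> m_pcm;
    std::size_t m_bufferBytes;
};

// A receiver that only counts what it is given.
class ByteCounter : public IByteReceiver {
public:
    std::size_t calls = 0;
    std::size_t total = 0;
    void ReceiveBuffer(const char*, std::size_t bytes) override {
        ++calls;
        total += bytes;
    }
};
```

Feeding 10 bytes through a FakeSource with 4-byte buffers calls ReceiveBuffer three times (4, 4 and 2 bytes), which is why a receiver should always rely on the byte count it is given rather than on a fixed buffer size.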
    

    Let's see some examples.

    Examples

    1. How would we list all the Wave In devices in the system?
      const vector<CWaveINSimple*>& wInDevices = CWaveINSimple::GetDevices();
      UINT i;
      
      for (i = 0; i < wInDevices.size(); i++) {
          printf("%s\n", wInDevices[i]->GetName());
      }
    2. How would we list a Wave In device's lines (supposing that strDeviceName = e.g., "SoundMAX Digital Audio")?
      CWaveINSimple& WaveInDevice = CWaveINSimple::GetDevice(strDeviceName);
      CHAR szName[MIXER_LONG_NAME_CHARS];
      UINT j;
      
      try {
          CMixer& mixer = WaveInDevice.OpenMixer();
          const vector<CMixerLine*>& mLines = mixer.GetLines();
      
          for (j = 0; j < mLines.size(); j++) {
              // Useful when Line has non proper English name 
              ::CharToOem(mLines[j]->GetName(), szName);
              printf("%s\n", szName);
          }
      
          mixer.Close();
      }
      catch (const char *err) {
          printf("%s\n", err);
      }
    3. How would we record and encode in MP3 actually?

      First of all, we define a class like:

      class mp3Writer: public IReceiver {
      private:
          CMP3Simple    m_mp3Enc;
          FILE *f;
      
      public:
          mp3Writer(unsigned int bitrate = 128, 
                        unsigned int finalSampleRate = 0): 
                m_mp3Enc(bitrate, 44100, finalSampleRate) {
              f = fopen("music.mp3", "wb");
              if (f == NULL) throw "Can't create MP3 file.";
          };
      
          ~mp3Writer() {
              fclose(f);
          };
      
          virtual void ReceiveBuffer(LPSTR lpData, DWORD dwBytesRecorded) {
              BYTE    mp3Out[44100 * 4];
              DWORD    dwOut;
              m_mp3Enc.Encode((PSHORT) lpData, dwBytesRecorded/2, 
                                       mp3Out, &dwOut);
      
              fwrite(mp3Out, dwOut, 1, f);
          };
      };
      

      and (supposing that strLineName = e.g., "Microphone"):

      try {
          CWaveINSimple& device = CWaveINSimple::GetDevice(strDeviceName);
          CMixer& mixer = device.OpenMixer();
          CMixerLine& mixerline = mixer.GetLine(strLineName);
      
          mixerline.UnMute();
          mixerline.SetVolume(0);
          mixerline.Select();
          mixer.Close();
      
          mp3Writer *mp3Wr = new mp3Writer();
          device.Start((IReceiver *) mp3Wr);
          while( !_kbhit() ) ::Sleep(100);
              
          device.Stop();
          delete mp3Wr;
      }
      catch (const char *err) {
          printf("%s\n", err);
      }
      
      CWaveINSimple::CleanUp();
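
The mp3Writer class above encodes into a fixed output buffer of 44100 * 4 bytes. A quick way to sanity-check such a size is the worst-case estimate given in the comments of LAME's "lame.h" for lame_encode_buffer(): 1.25 * num_samples + 7200 bytes, with num_samples counted per channel. The sketch below applies that formula. It is an assumption that the same bound holds for the "lame_enc.dll" interface wrapped by CMP3Simple; the wrapper's own MaxOutBufferSize method, mentioned earlier, is the better source when available. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Worst-case estimate quoted in lame.h for lame_encode_buffer():
//   mp3buf_size in bytes = 1.25 * num_samples + 7200
// The 1.25 factor is computed in integers and rounded up.
constexpr std::size_t WorstCaseMp3Bytes(std::size_t samplesPerChannel) {
    return (5 * samplesPerChannel + 3) / 4 + 7200;
}

// Hypothetical helper: samples per channel in an interleaved 16-bit PCM buffer.
constexpr std::size_t SamplesPerChannel(std::size_t bytes, std::size_t channels) {
    return bytes / (sizeof(std::int16_t) * channels);
}
```

For example, a 4096-byte stereo capture buffer holds 1024 samples per channel, and the estimate gives 8480 bytes, comfortably below the 176400 bytes reserved in ReceiveBuffer.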

    Remark 1

    mixerline.SetVolume(0) is a pretty tricky point. For some sound cards, SetVolume(0) gives the original (good) sound quality; for others, SetVolume(100) does the same. However, you can also find sound cards where SetVolume(15) gives the best quality. I have no good advice here; just try and check.
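
One way to make "try and check" less subjective is to measure the recorded samples while experimenting with the volume. The sketch below computes the peak level of a 16-bit PCM buffer and flags samples that hit the limits of the range, which is a common sign of clipping. PeakLevel and LooksClipped are hypothetical helpers, not part of the article's sources; they could be called from a ReceiveBuffer implementation on the incoming data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Largest absolute sample value in a 16-bit PCM buffer (0..32768).
inline int PeakLevel(const std::int16_t* samples, std::size_t count) {
    int peak = 0;
    for (std::size_t i = 0; i < count; ++i) {
        // Widen to int first so that -32768 has a representable magnitude.
        int v = std::abs(static_cast<int>(samples[i]));
        if (v > peak) peak = v;
    }
    return peak;
}

// True if any sample sits at the edge of the 16-bit range.
inline bool LooksClipped(const std::int16_t* samples, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        if (samples[i] == INT16_MAX || samples[i] == INT16_MIN) return true;
    }
    return false;
}
```

A peak that stays near zero suggests the line volume is too low, while frequent clipping suggests it is too high; the useful setting for a given card is somewhere in between.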

    Remark 2

    Almost every sound card supports a "Wave Out Mix" or "Stereo Mix" Mixer's Line (the exact name varies from card to card). Recording from such a line (mixerline.Select()) will actually record everything going to the sound card's Wave Out (read "speakers"). So, leave WinAmp or Windows Media Player playing for a while, start the application to record the sound at the same time, and you'll see the result.

    Remark 3

    Rather than calling:

    mp3Writer *mp3Wr = new mp3Writer();

    it is also possible to create an instance of mp3Writer as follows (see the class definition above):

    mp3Writer *mp3Wr = new mp3Writer(64, 32000);

    This will produce a final MP3 at a 64 Kbps bitrate and a 32 kHz sample rate.
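
To get a feeling for what these numbers mean on disk, the arithmetic can be sketched as follows. The helpers are hypothetical, and the MP3 figure assumes constant-bitrate output where 1 Kbps means 1000 bits per second; how the wrapper actually configures LAME is not shown in this article.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per second of uncompressed PCM.
constexpr std::uint64_t PcmBytesPerSecond(unsigned sampleRate, unsigned channels,
                                          unsigned bitsPerSample) {
    return static_cast<std::uint64_t>(sampleRate) * channels * bitsPerSample / 8;
}

// Bytes per second of a constant-bitrate MP3 stream (1 Kbps = 1000 bits/s).
constexpr std::uint64_t Mp3BytesPerSecond(unsigned kbps) {
    return static_cast<std::uint64_t>(kbps) * 1000 / 8;
}
```

With 16-bit stereo input at 44100 Hz, the raw stream is 176400 bytes per second, while a 64 Kbps MP3 is 8000 bytes per second, so one minute of sound takes roughly 480000 bytes instead of 10584000.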

    Comments on using the demo application

    The demo application (see the links at the top of this article) is a console application supporting two command line options. Executing the application without specifying any of the command line options will simply print the usage guideline, e.g.:

    ...>mp3_stream.exe
    mp3_stream.exe -devices
            Will list WaveIN devices.
    
    mp3_stream.exe -device=<device_name>
            Will list recording lines of the WaveIN <device_name> device.
    
    mp3_stream.exe -device=<device_name> -line=<line_name> 
              [-v=<volume>] [-br=<bitrate>] [-sr=<samplerate>]
            Will record from the <line_name> 
            at the given voice <volume>, output <bitrate> (in Kbps)
            and output <samplerate> (in Hz).
    
            <volume>, <bitrate> and <samplerate> are optional parameters.
            <volume> - integer value between (0..100), defaults to 0 if not set.
            <bitrate> - integer value (16, 24, 32, .., 64, etc.), 
                            defaults to 128 if not set.
            <samplerate> - integer value (44100, 32000, 22050, etc.), 
                            defaults to 44100 if not set.

    Executing the application with the "-devices" command line option will print the names of the Wave In devices currently installed in the system, e.g.:

    ...>mp3_stream.exe -devices
    Realtek AC97 Audio
    

    Executing the application with the "-device=<device_name>" command line option will list all the lines of the selected Wave In device, e.g.:

    ...>mp3_stream.exe "-device=Realtek AC97 Audio"
    Mono Mix
    Stereo Mix
    Aux
    TV Tuner Audio
    CD Player
    Line In
    Microphone
    Phone Line
    

    Finally, the application will start recording (and encoding) sound from the selected Wave In device/line (microphone in this example) when executed with the following command line options:

    ...>mp3_stream.exe "-device=Realtek AC97 Audio" -line=Microphone
    
    Recording at 128Kbps, 44100Hz
    from Microphone (Realtek AC97 Audio).
    Volume 0%.
    
    hit <ENTER> to stop ...
    

    Recorded and encoded sound is saved in the "music.mp3" file, in the same folder from where you executed the application.

    If you want to record sound that is currently playing (e.g., AVI movie, or Video DVD, or ...) through the soundcard Wave Out, you can run the application with the following options:

    ...>mp3_stream.exe "-device=Realtek AC97 Audio" "-line=Stereo Mix"
    

    However, this may be specific to my configuration only (also explained in "Remark 2" above).

    You can specify additional command line parameters, e.g.:

    ...>mp3_stream.exe "-device=Realtek AC97 Audio" 
            "-line=Stereo Mix" -v=100 -br=32 -sr=32000

    This will set the line's volume to 100%, and will produce the final MP3 at 32 Kbps and 32 kHz.
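
The option syntax used throughout this section (-devices, -device=, -line=, -v=, -br=, -sr=) can be parsed with a few lines of standard C++. The sketch below is only an illustration with the defaults taken from the usage text; it is not the demo's actual parser, which is not shown in this article, and the names Options, TakeValue, ToUnsigned and ParseOptions are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

struct Options {
    bool listDevices = false;
    std::string device;
    std::string line;
    unsigned volume = 0;          // percent, 0..100
    unsigned bitrate = 128;       // Kbps
    unsigned sampleRate = 44100;  // Hz
};

// If `arg` starts with `prefix`, store the remainder in `value`.
inline bool TakeValue(const std::string& arg, const std::string& prefix,
                      std::string& value) {
    if (arg.compare(0, prefix.size(), prefix) != 0) return false;
    value = arg.substr(prefix.size());
    return true;
}

// Non-numeric input yields 0, as strtoul does.
inline unsigned ToUnsigned(const std::string& s) {
    return static_cast<unsigned>(std::strtoul(s.c_str(), nullptr, 10));
}

inline Options ParseOptions(const std::vector<std::string>& args) {
    Options o;
    std::string v;
    for (const std::string& a : args) {
        if (a == "-devices")                  o.listDevices = true;
        else if (TakeValue(a, "-device=", v)) o.device = v;
        else if (TakeValue(a, "-line=", v))   o.line = v;
        else if (TakeValue(a, "-v=", v))      o.volume = std::min(ToUnsigned(v), 100u);
        else if (TakeValue(a, "-br=", v))     o.bitrate = ToUnsigned(v);
        else if (TakeValue(a, "-sr=", v))     o.sampleRate = ToUnsigned(v);
        // Unknown arguments are ignored in this sketch.
    }
    return o;
}
```

Because the shell strips the surrounding quotes, an argument such as "-device=Realtek AC97 Audio" arrives as a single string, so names containing spaces need no special handling here.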

    Conclusion

    In this article, I covered a couple of months that I spent investigating MP3 encoding APIs and recording (capturing, actually) sound going to the sound card's speakers. I used all these techniques to implement an Internet radio station (MP3 streaming server). I found this topic very interesting, and decided to share some of my code. In one of my next articles, I will try to cover some of the aspects related to MP3 streaming and IO Completion Ports, but, until that time, I have to clean up the existing code, comment it, and prepare the article :).

    License

    This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
