This all started when Mango (芒果) ran into an exception while using mochiweb. I found the same problem reported in the mochiweb Google group:

=ERROR REPORT==== 7-Apr-2011::18:58:22 ===
"web request failed" path: "cfsp/entity" type: error what: badarg trace: [{erlang,iolist_size,[[...]]},
{mochiweb_request,respond,2},
{rest_server_web,loop,1},
{mochiweb_http,headers,5},
{proc_lib,init_p_do_apply,3}]

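The poster assumed the exception was caused by an overly long document. In his reply below, Bob first rejects that guess and then focuses on the iolist failure that the trace points to explicitly. His reply: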

I don't think it has anything to do with the size of your document, your code is somehow returning a value that is not an iolist. Perhaps there is an atom in there, or an integer outside of 0..255 (I guess this is more likely, I don't know xmerl output very well).

1> iolist_size([256]).
** exception error: bad argument in function iolist_size/1
called as iolist_size([256])

You probably want UTF-8, so unicode:characters_to_binary(xmerl:export_simple(Res,xmerl_xml)) is my guess at what you really want to be doing for the output.

-bob

Full thread: http://groups.google.com/group/mochiweb/browse_thread/thread/f67abc113b338bfe?pli=1
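To see why Bob's suggestion works, here is a minimal sketch. The module name iolist_utf8_demo and the function to_body/1 are hypothetical and not part of mochiweb. The sketch assumes the response body is character data that may contain code points above 255, which iolist_size/1 rejects but unicode:characters_to_binary/1 encodes as UTF-8.

```erlang
-module(iolist_utf8_demo).
-export([to_body/1]).

%% Hypothetical helper (not part of mochiweb): encode character data as a
%% UTF-8 binary so that the result is always valid iodata.
to_body(Chars) ->
    case unicode:characters_to_binary(Chars) of
        Bin when is_binary(Bin) ->
            %% Every code point was encoded; a binary is valid iodata.
            {ok, Bin};
        {error, _Encoded, _Rest} ->
            %% The input contained something that cannot be encoded.
            {error, invalid_chardata};
        {incomplete, _Encoded, _Rest} ->
            %% The input ended in the middle of a multi-byte sequence.
            {error, incomplete_chardata}
    end.
```

For example, to_body([20013]) (the single code point U+4E2D) gives {ok, <<228,184,173>>}, a three-byte binary that iolist_size/1 accepts, whereas iolist_size([20013]) raises badarg.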

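I passed this hint on to Mango, and it did solve the problem. But we should not stop here. A follow-up question: what is the difference between an Erlang list and an iolist?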

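Definition of iolist

The official Erlang documentation says little about iolist, but the definition can be found:
iodata() = iolist() | binary()
iolist()     maybe_improper_list(byte() | binary() | iolist(), binary() | [])
maybe_improper_list()     maybe_improper_list(any(), any())
byte()     0..255
char()     0..16#10ffff
maybe_improper_list(T)     maybe_improper_list(T, any())
Note that the integer elements of an iolist are byte(), not char(): an integer outside 0..255 is rejected, as the tests further down show.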

Or:

IoData = unicode:chardata()
chardata() = charlist() | unicode_binary()
charlist() = [unicode_char() | unicode_binary() | charlist()]
unicode_binary() = binary()

A binary() with characters encoded in the UTF-8 coding standard.
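The binary() | [] tail in the iolist() type above means that an iolist may be an improper list whose tail is a binary. A minimal sketch of how that permits adding a chunk to the end without copying what has been built so far; the module iolist_tail and the function append_chunk/2 are hypothetical names used only for illustration:

```erlang
-module(iolist_tail).
-export([append_chunk/2]).

%% Hypothetical helper: "append" a binary chunk to an iolist by building one
%% new cons cell. The accumulator becomes the head and the chunk becomes the
%% (improper) binary tail, so nothing already built is copied.
append_chunk(Acc, Chunk) when is_binary(Chunk) ->
    [Acc | Chunk].
```

For instance, append_chunk(append_chunk(<<"a">>, <<"b">>), <<"c">>) builds [[<<"a">>|<<"b">>]|<<"c">>], and iolist_to_binary/1 turns it into <<"abc">>.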

Note the two iolist-related functions below; the argument they accept may also be a plain binary:
iolist_size(Item) -> integer() >= 0

Types:Item = iolist() | binary() 

iolist_to_binary(IoListOrBinary) -> binary()

Types:IoListOrBinary = iolist() | binary()
Official documentation: http://www.erlang.org/doc/reference_manual/typespec.html
Let's test this ourselves:
Eshell V5.9  (abort with ^G)
1> iolist_size([]).
0
2> iolist_size([<<"anc">>]).
3
3> iolist_size([12,<<"anc">>]).
4
4> iolist_size([12,<<"anc">>,23]).
5
5> iolist_size([12,<<"anc">>,23,<<"king">>]).
9
6> iolist_size([12,<<"anc">>,23,<<"king">>,[23,34,<<"test">>]]).
15
7> iolist_size(<<"abc">>).
3
8> iolist_size(<<>>).
0
9> iolist_size([1234]).
** exception error: bad argument
     in function  iolist_size/1
        called as iolist_size([1234])
10> iolist_size([<<1:1>>]).
** exception error: bad argument
     in function  iolist_size/1
        called as iolist_size([<<1:1>>])
11> iolist_size( [12,23,"abc",<<abc>>]).
** exception error: bad argument
12> iolist_size( [12,23,<<abc>>]).
** exception error: bad argument
13> iolist_size( [12,23,"abc",<<"abc">>]).
8
14>  L=[$H, $e, [$l, <<"lo">>, " "], [[["W","o"], <<"rl">>]] | [<<"d">>]].
[72,101,[108,<<"lo">>," "],[[["W","o"],<<"rl">>]],<<"d">>]
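The shell session above can be condensed into a small predicate. This is only a sketch: the module iolist_check and the function is_iodata/1 are hypothetical names, and the check simply relies on iolist_size/1 raising badarg for anything that is not an iolist or a binary.

```erlang
-module(iolist_check).
-export([is_iodata/1]).

%% Hypothetical helper: returns true when Term is an iolist or a binary,
%% i.e. when iolist_size/1 accepts it, and false when it raises badarg.
is_iodata(Term) ->
    try iolist_size(Term) of
        _Size -> true
    catch
        error:badarg -> false
    end.
```

With this, is_iodata([12,<<"anc">>]) is true, while is_iodata([1234]), is_iodata([<<1:1>>]) and is_iodata([abc]) are all false, matching the results in the session.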

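When is iolist useful?

The first thing I could find was mryufeng's post "What is the difference between iolist and list?" (《iolist跟list有什么区别?》) http://mryufeng.iteye.com/blog/634867
The post derives the definition of the iolist data structure from the source code and then explains what an iolist is for:
An iolist is used when sending data to a port. Because underlying system calls such as writev support vectored writes, this avoids needless flattening operations such as iolist_to_binary and avoids memory copies, which greatly improves efficiency. Use it often.
What does that mean? I found the answer at the start of the "Buckets of Sockets" chapter on the Learn You Some Erlang site: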
A = [a]
B = [b|A] = [b,a]
C = [c|B] = [c,b,a]
In the case of prepending, as above, whatever is held in A or B or C never needs to be rewritten. The representation of C can be seen as either [c,b,a], [c|B] or [c|[b|[a]]], among others. In the last case, you can see that the shape of A is the same at the end of the list as when it was declared. Similarly for B. Here's how it looks with appending:

A = [a]
B = A ++ [b] = [a] ++ [b] = [a|[b]]
C = B ++ [c] = [a|[b]] ++ [c] = [a|[b|[c]]]
Do you see all that rewriting? When we create B, we have to rewrite A. When we write C, we have to rewrite B (including the [a|...] part it contains). If we were to add D in a similar manner, we would need to rewrite C. Over long strings, this becomes way too inefficient, and it creates a lot of garbage left to be cleaned up by the Erlang VM.

With binaries, things are not exactly as bad:

A = <<"a">>
B = <<A/binary, "b">> = <<"ab">>
C = <<B/binary, "c">> = <<"abc">>
In this case, binaries know their own length and data can be joined in constant time. That's good, much better than lists. They're also more compact. For these reasons, we'll often try to stick to binaries when using text in the future.

There are a few downsides, however. Binaries were meant to handle things in certain ways, and there is still a cost to modifying binaries, splitting them, etc. Moreover, sometimes we'll work with code that uses strings, binaries, and individual characters interchangeably. Constantly converting between types would be a hassle.

In these cases, IO lists are our saviour. IO lists are a weird type of data structure. They are lists of either bytes (integers from 0 to 255), binaries, or other IO lists. This means that functions that accept IO lists can accept items such as [$H, $e, [$l, <<"lo">>, " "], [[["W","o"], <<"rl">>]] | [<<"d">>]]. When this happens, the Erlang VM will just flatten the list as it needs to do it to obtain the sequence of characters Hello World.

What are the functions that accept such IO Lists? Most of the functions that have to do with outputting data do. Any function from the io module, file module, TCP and UDP sockets will be able to handle them. Some library functions, such as some coming from the unicode module and all of the functions from the re (for regular expressions) module will also handle them, to name a few.

Try the previous Hello World IO List in the shell with io:format("~s~n", [IoList]) just to see. It should work without a problem.

All in all, they're a pretty clever way of building strings to avoid the problems of immutable data structures when it comes to dynamically building content to be output.

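To summarize the passage above:
-> Prepending to a list is very fast, but appending to the end of a list requires rewriting (copying) the existing list.
-> With binaries, appending to the end can be done in constant time, but there are two problems: (1) modifying and splitting binaries still has a cost; (2) code often mixes strings, binaries, and single characters, and constantly converting between them is a hassle.
-> An iolist handles this kind of mixed data well: the Erlang VM flattens the list when it needs to. You can use io:format to check what an iolist built from mixed data looks like on output.
-> In short, under the single-assignment constraint, an iolist is a good way to dynamically build string content for output.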
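The points above can be sketched with a small example that builds output by nesting rather than by appending. The module iolist_build and its functions are hypothetical names; the fields of a row are assumed to be strings or binaries, and the result is only flattened once, at the point where it is output.

```erlang
-module(iolist_build).
-export([render_rows/1, render_row/1]).

%% Hypothetical helper: render a list of rows as comma-separated lines.
%% No ++ and no binary concatenation is used, only nesting.
render_rows(Rows) ->
    [render_row(Row) || Row <- Rows].

%% A row is a list of fields, each a string or a binary (an assumption of
%% this sketch). Every field after the first is prefixed with a comma.
render_row([]) ->
    [$\n];
render_row([First | Rest]) ->
    [First, [[$,, Field] || Field <- Rest], $\n].
```

Calling iolist_to_binary(iolist_build:render_rows([["a", <<"b">>, "c"], [<<"1">>, "2"]])) yields <<"a,b,c\n1,2\n">>; in real code the nested structure would normally be handed to a socket or file function as is.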
We can use erlc +to_core M.erl (see: [Erlang 0029] Erlang Inline Compilation) to look at the Core Erlang representation of an iolist.
In Core Erlang, List = [1,2,3,4,5,6,7,8,9] is represented as: [1|[2|[3|[4|[5|[6|[7|[8|[9]]]]]]]]]
Take this code:
 L=[$H, $e, [$l, <<"lo">>, " "], [[["W","o"], <<"rl">>]] | [<<"d">>]],
iolist_size(L)

It is translated to:
 do  %% Line 14
     call 'erlang':'iolist_size'
([72|[101|[[108|[#{#<108>(8,1,'integer',['unsigned'|['big']]),
#<111>(8,1,'integer',['unsigned'|['big']])}#|[[32]]]]|[[[[[87]|[[111]]]|[#{#<114>(8,1,'integer',['unsigned'|['big']]),
#<108>(8,1,'integer',['unsigned'|['big']])}#]]]|[#{#<100>(8,1,'integer',['unsigned'|['big']])}#]]]]])

Related reading

Someone on Stack Overflow raised the same question:

Ports, external or linked-in, accept something called io-lists for sending data to them. An io-list is a binary or a (possibly deep) list of binaries or integers in the range 0..255.
This means that rather than concatenating two lists before sending them to a port, one can just send them as two items in a list. So instead of
"foo" ++ "bar"
one can do
["foo", "bar"]
In this example it is of course of minuscule difference. But the iolist in itself allows for convenient programming when creating output data. io_lib:format/2,3 itself returns an io list for example.
The function erlang:list_to_binary/1 accepts io lists, but now we have erlang:iolist_to_binary/1 which convey the intention better. There is also an erlang:iolist_size/1.
Best of all, since files and sockets are implemented as ports, you can send iolists to them. No need to flatten or append.
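A small sketch of the point about io_lib:format/2: its result is a (possibly deep) list that can be treated as an iolist without flattening it yourself. The module iolist_format and the function greeting/1 are hypothetical names, and the name passed in is assumed to be a Latin-1 string or a binary.

```erlang
-module(iolist_format).
-export([greeting/1]).

%% Hypothetical helper: returns the io_lib:format/2 result untouched.
%% No lists:flatten/1 is applied; the caller passes the result straight to
%% an output function or to iolist_to_binary/1.
greeting(Name) ->
    io_lib:format("Hello, ~s!~n", [Name]).
```

iolist_to_binary(iolist_format:greeting("World")) gives <<"Hello, World!\n">>, and the same term could be written with io:put_chars/1 without any intermediate conversion.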

And there is this article: A Ramble Through Erlang IO Lists http://prog21.dadgum.com/70.html

The IO List is a handy data type in Erlang, but not one that's often discussed in tutorials. It's any binary. Or any list containing integers between 0 and 255. Or any arbitrarily nested list containing either of those two things. Like this:

[10, 20, "hello", <<"hello",65>>, [<<1,2,3>>, 0, 255]]

The key to IO lists is that you never flatten them. They get passed directly into low-level runtime functions (such as file:write_file), and the flattening happens without eating up any space in your Erlang process. Take advantage of that! Instead of appending values to lists, use nesting instead. For example, here's a function to put a string in quotes:

quote(String) -> [$"] ++ String ++ [$"].

If you're working with IO lists, you can avoid the append operations completely (and the second "++" above results in an entirely new version of String being created). This version uses nesting instead:

quote(String) -> [$", String, $"].

This creates three list elements no matter how long the initial string is. The first version creates length(String) + 2 elements. It's also easy to go backward and un-quote the string: just take the second list element. Once you get used to nesting you can avoid most append operations completely.

One thing that nested list trick is handy for is manipulating filenames. Want to add a directory name and ".png" extension to a filename? Just do this:

[Directory, $/, Filename, ".png"]

Unfortunately, filenames in the file module are not true IO lists. You can pass in deep lists, but they get flattened by an Erlang function (file:file_name/1), not the runtime system. That means you can still dodge appending lists in your own code, but things aren't as efficient behind the scenes as they could be. And "deep lists" in this case means only lists, not binaries. Strangely, these deep lists can also contain atoms, which get expanded via atom_to_list.

Ideally filenames would be IO lists, but for compatibility reasons there's still the need to support atoms in filenames. That brings up an interesting idea: why not allow atoms as part of the general IO list specification? It makes sense, as the runtime system has access to the atom table, and there's a simple correspondence between an atom and how it gets encoded in a binary; 'atom' is treated the same as "atom". I find I'm often calling atom_to_list before sending data to external ports, and that would no longer be necessary.
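The nesting trick from the article can be tried out with the sketch below. The module iolist_quote and the function png_path/2 are hypothetical names; quote/1 is the nested version quoted above, and unquote/1 illustrates "just take the second list element". The path is built for output only, since the article notes that the file module does not treat filenames as true IO lists.

```erlang
-module(iolist_quote).
-export([quote/1, unquote/1, png_path/2]).

%% Wrap a string (or any iodata) in double quotes by nesting: always three
%% list elements, however long String is.
quote(String) ->
    [$", String, $"].

%% Going backward: the original value is simply the second element.
unquote([$", String, $"]) ->
    String.

%% Hypothetical helper: build "Directory/Filename.png" as an iolist.
png_path(Directory, Filename) ->
    [Directory, $/, Filename, ".png"].
```

For example, iolist_to_binary(iolist_quote:quote("abc")) is <<"\"abc\"">>, unquote(quote("abc")) returns "abc", and iolist_to_binary(iolist_quote:png_path("img", <<"cat">>)) is <<"img/cat.png">>.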

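Summary
Under the single-assignment constraint, an iolist avoids conversions between strings and binary data and is a good way to dynamically build string content for output.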
