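In general, a server parses every incoming request to decide whether it is valid and how it should be routed. This section explains how to get at the request data. (When reposting, please credit breaksoftware's CSDN blog.)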

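Using the method from "Server Setup Notes: Compiling Apache and Its Modules", we create a Handler project named get_request. In this project, the entry function we can work with is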

    static int get_request_handler(request_rec *r)
    {
        r->content_type = "text/html";

The only data this entry function hands us directly is r, a pointer to a request_rec structure. Looking through the source, we find its definition

    /**
     * @brief A structure that represents the current request
     */
    struct request_rec {
        /** The pool associated with the request */
        apr_pool_t *pool;
        /** The connection to the client */
        conn_rec *connection;
        /** The virtual host for this request */
        server_rec *server;

        /** Pointer to the redirected request if this is an external redirect */
        request_rec *next;
        /** Pointer to the previous request if this is an internal redirect */
        request_rec *prev;

        /** Pointer to the main request if this is a sub-request
         * (see http_request.h) */
        request_rec *main;

        /* Info about the request itself... we begin with stuff that only
         * protocol.c should ever touch...
         */
        /** First line of request */
        char *the_request;
        /** HTTP/0.9, "simple" request (e.g. GET /foo\n w/no headers) */
        int assbackwards;
        /** A proxy request (calculated during post_read_request/translate_name)
         *  possible values PROXYREQ_NONE, PROXYREQ_PROXY, PROXYREQ_REVERSE,
         *                  PROXYREQ_RESPONSE
         */
        int proxyreq;
        /** HEAD request, as opposed to GET */
        int header_only;
        /** Protocol version number of protocol; 1.1 = 1001 */
        int proto_num;
        /** Protocol string, as given to us, or HTTP/0.9 */
        char *protocol;
        /** Host, as set by full URI or Host: */
        const char *hostname;

        /** Time when the request started */
        apr_time_t request_time;

        /** Status line, if set by script */
        const char *status_line;
        /** Status line */
        int status;

        /* Request method, two ways; also, protocol, etc..  Outside of protocol.c,
         * look, but don't touch.
         */

        /** M_GET, M_POST, etc. */
        int method_number;
        /** Request method (eg. GET, HEAD, POST, etc.) */
        const char *method;

        /**
         *  'allowed' is a bitvector of the allowed methods.
         *
         *  A handler must ensure that the request method is one that
         *  it is capable of handling.  Generally modules should DECLINE
         *  any request methods they do not handle.  Prior to aborting the
         *  handler like this the handler should set r->allowed to the list
         *  of methods that it is willing to handle.  This bitvector is used
         *  to construct the "Allow:" header required for OPTIONS requests,
         *  and HTTP_METHOD_NOT_ALLOWED and HTTP_NOT_IMPLEMENTED status codes.
         *
         *  Since the default_handler deals with OPTIONS, all modules can
         *  usually decline to deal with OPTIONS.  TRACE is always allowed,
         *  modules don't need to set it explicitly.
         *
         *  Since the default_handler will always handle a GET, a
         *  module which does *not* implement GET should probably return
         *  HTTP_METHOD_NOT_ALLOWED.  Unfortunately this means that a Script GET
         *  handler can't be installed by mod_actions.
         */
        apr_int64_t allowed;
        /** Array of extension methods */
        apr_array_header_t *allowed_xmethods;
        /** List of allowed methods */
        ap_method_list_t *allowed_methods;

        /** byte count in stream is for body */
        apr_off_t sent_bodyct;
        /** body byte count, for easy access */
        apr_off_t bytes_sent;
        /** Last modified time of the requested resource */
        apr_time_t mtime;

        /* HTTP/1.1 connection-level features */

        /** The Range: header */
        const char *range;
        /** The "real" content length */
        apr_off_t clength;
        /** sending chunked transfer-coding */
        int chunked;

        /** Method for reading the request body
         * (eg. REQUEST_CHUNKED_ERROR, REQUEST_NO_BODY,
         *  REQUEST_CHUNKED_DECHUNK, etc...) */
        int read_body;
        /** reading chunked transfer-coding */
        int read_chunked;
        /** is client waiting for a 100 response? */
        unsigned expecting_100;
        /** The optional kept body of the request. */
        apr_bucket_brigade *kept_body;
        /** For ap_body_to_table(): parsed body */
        /* XXX: ap_body_to_table has been removed. Remove body_table too or
         * XXX: keep it to reintroduce ap_body_to_table without major bump? */
        apr_table_t *body_table;
        /** Remaining bytes left to read from the request body */
        apr_off_t remaining;
        /** Number of bytes that have been read  from the request body */
        apr_off_t read_length;

        /* MIME header environments, in and out.  Also, an array containing
         * environment variables to be passed to subprocesses, so people can
         * write modules to add to that environment.
         *
         * The difference between headers_out and err_headers_out is that the
         * latter are printed even on error, and persist across internal redirects
         * (so the headers printed for ErrorDocument handlers will have them).
         *
         * The 'notes' apr_table_t is for notes from one module to another, with no
         * other set purpose in mind...
         */

        /** MIME header environment from the request */
        apr_table_t *headers_in;
        /** MIME header environment for the response */
        apr_table_t *headers_out;
        /** MIME header environment for the response, printed even on errors and
         * persist across internal redirects */
        apr_table_t *err_headers_out;
        /** Array of environment variables to be used for sub processes */
        apr_table_t *subprocess_env;
        /** Notes from one module to another */
        apr_table_t *notes;

        /* content_type, handler, content_encoding, and all content_languages
         * MUST be lowercased strings.  They may be pointers to static strings;
         * they should not be modified in place.
         */
        /** The content-type for the current request */
        const char *content_type;   /* Break these out --- we dispatch on 'em */
        /** The handler string that we use to call a handler function */
        const char *handler;        /* What we *really* dispatch on */

        /** How to encode the data */
        const char *content_encoding;
        /** Array of strings representing the content languages */
        apr_array_header_t *content_languages;

        /** variant list validator (if negotiated) */
        char *vlist_validator;

        /** If an authentication check was made, this gets set to the user name. */
        char *user;
        /** If an authentication check was made, this gets set to the auth type. */
        char *ap_auth_type;

        /* What object is being requested (either directly, or via include
         * or content-negotiation mapping).
         */

        /** The URI without any parsing performed */
        char *unparsed_uri;
        /** The path portion of the URI, or "/" if no path provided */
        char *uri;
        /** The filename on disk corresponding to this response */
        char *filename;
        /* XXX: What does this mean? Please define "canonicalize" -aaron */
        /** The true filename, we canonicalize r->filename if these don't match */
        char *canonical_filename;
        /** The PATH_INFO extracted from this request */
        char *path_info;
        /** The QUERY_ARGS extracted from this request */
        char *args;

        /**
         * Flag for the handler to accept or reject path_info on
         * the current request.  All modules should respect the
         * AP_REQ_ACCEPT_PATH_INFO and AP_REQ_REJECT_PATH_INFO
         * values, while AP_REQ_DEFAULT_PATH_INFO indicates they
         * may follow existing conventions.  This is set to the
         * user's preference upon HOOK_VERY_FIRST of the fixups.
         */
        int used_path_info;

        /** A flag to determine if the eos bucket has been sent yet */
        int eos_sent;

        /* Various other config info which may change with .htaccess files
         * These are config vectors, with one void* pointer for each module
         * (the thing pointed to being the module's business).
         */

        /** Options set in config files, etc. */
        struct ap_conf_vector_t *per_dir_config;
        /** Notes on *this* request */
        struct ap_conf_vector_t *request_config;

        /** Optional request log level configuration. Will usually point
         *  to a server or per_dir config, i.e. must be copied before
         *  modifying */
        const struct ap_logconf *log;

        /** Id to identify request in access and error log. Set when the first
         *  error log entry for this request is generated.
         */
        const char *log_id;

        /**
         * A linked list of the .htaccess configuration directives
         * accessed by this request.
         * N.B. always add to the head of the list, _never_ to the end.
         * that way, a sub request's list can (temporarily) point to a parent's list
         */
        const struct htaccess_result *htaccess;

        /** A list of output filters to be used for this request */
        struct ap_filter_t *output_filters;
        /** A list of input filters to be used for this request */
        struct ap_filter_t *input_filters;

        /** A list of protocol level output filters to be used for this
         *  request */
        struct ap_filter_t *proto_output_filters;
        /** A list of protocol level input filters to be used for this
         *  request */
        struct ap_filter_t *proto_input_filters;

        /** This response can not be cached */
        int no_cache;
        /** There is no local copy of this response */
        int no_local_copy;

        /** Mutex protect callbacks registered with ap_mpm_register_timed_callback
         * from being run before the original handler finishes running
         */
        apr_thread_mutex_t *invoke_mtx;

        /** A struct containing the components of URI */
        apr_uri_t parsed_uri;
        /**  finfo.protection (st_mode) set to zero if no such file */
        apr_finfo_t finfo;

        /** remote address information from conn_rec, can be overridden if
         * necessary by a module.
         * This is the address that originated the request.
         */
        apr_sockaddr_t *useragent_addr;
        char *useragent_ip;

        /** MIME trailer environment from the request */
        apr_table_t *trailers_in;
        /** MIME trailer environment from the response */
        apr_table_t *trailers_out;
    };

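This is a very large structure that covers just about everything. For a beginner, working out what every field means is hard. Our needs are simple, so we list only the fields we are likely to care about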

    /** First line of request */
    char *the_request;

The first line of the request.

    /** Protocol version number of protocol; 1.1 = 1001 */
    int proto_num;
    /** Protocol string, as given to us, or HTTP/0.9 */
    char *protocol;
    /** Host, as set by full URI or Host: */
    const char *hostname;

The protocol version number, the protocol string, and the host name.
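The comment on proto_num says 1.1 = 1001: the major version is multiplied by 1000 and the minor version is added. Below is a minimal standalone sketch of that arithmetic. The helper names proto_make, proto_major and proto_minor are hypothetical and exist only for illustration; httpd.h ships its own HTTP_VERSION, HTTP_VERSION_MAJOR and HTTP_VERSION_MINOR macros for the same purpose.

```c
/* Hypothetical helpers illustrating the proto_num encoding (1.1 = 1001). */

/* Build a version number from its parts: major * 1000 + minor. */
int proto_make(int major, int minor)
{
    return 1000 * major + minor;
}

/* Major part of a version number such as 1001. */
int proto_major(int proto_num)
{
    return proto_num / 1000;
}

/* Minor part of a version number such as 1001. */
int proto_minor(int proto_num)
{
    return proto_num % 1000;
}
```

In a handler, a check such as proto_major(r->proto_num) == 1 would then separate HTTP/1.x requests from other protocol versions.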

    /** Time when the request started */
    apr_time_t request_time;

The time at which the request started.

    /** The URI without any parsing performed */
    char *unparsed_uri;
    /** The path portion of the URI, or "/" if no path provided */
    char *uri;
    /** The filename on disk corresponding to this response */
    char *filename;

The URI before URL-decoding, the URL-decoded path portion of the URI, and the path of the file on disk that serves the request.
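To make the difference between unparsed_uri and uri concrete, here is a minimal standalone sketch of percent-decoding. It is not the routine httpd uses internally; percent_decode and hex_value are hypothetical names introduced for illustration, and the sketch ignores details such as rejecting encoded slashes or NUL bytes.

```c
#include <ctype.h>
#include <string.h>

/* Value of a hexadecimal digit, or -1 if c is not one. */
int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Hypothetical helper: copy src to dst, turning each %XX escape into the
 * byte it encodes. dst must have room for strlen(src) + 1 bytes.
 * Returns 0 on success, -1 on a malformed escape (dst is then unspecified).
 */
int percent_decode(const char *src, char *dst)
{
    while (*src) {
        if (*src == '%') {
            int hi = hex_value((unsigned char)src[1]);
            /* Only look at the second digit if the first one was valid. */
            int lo = (hi < 0) ? -1 : hex_value((unsigned char)src[2]);
            if (hi < 0 || lo < 0) return -1;
            *dst++ = (char)(hi * 16 + lo);
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return 0;
}
```

Feeding it the path /AP%26AC%3aHE yields /AP&AC:HE, which is the same relationship as between an encoded request path and the decoded r->uri.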

    /** The PATH_INFO extracted from this request */
    char *path_info;
    /** The QUERY_ARGS extracted from this request */
    char *args;

The path info and the query arguments of the request.
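r->args is the raw query string (for example a=b), so a handler that wants a single parameter has to split it itself. The following is a minimal standalone sketch under that assumption; query_get is a hypothetical helper, it does not URL-decode the value, and it treats a NULL query string as "no parameters".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find "name=value" in a query string such as
 * "a=b&c=d" and copy the value into out (truncated to out_size - 1 bytes).
 * Returns 1 if the parameter was found, 0 otherwise.
 */
int query_get(const char *args, const char *name, char *out, size_t out_size)
{
    size_t name_len = strlen(name);
    const char *p = args;

    if (out_size == 0) return 0;

    while (p && *p) {
        /* Each pair runs up to the next '&' or the end of the string. */
        const char *end = strchr(p, '&');
        size_t pair_len = end ? (size_t)(end - p) : strlen(p);

        /* A match is "name" followed directly by '='. */
        if (pair_len > name_len &&
            strncmp(p, name, name_len) == 0 &&
            p[name_len] == '=') {
            size_t val_len = pair_len - name_len - 1;
            if (val_len >= out_size) val_len = out_size - 1;
            memcpy(out, p + name_len + 1, val_len);
            out[val_len] = '\0';
            return 1;
        }
        p = end ? end + 1 : NULL;
    }
    return 0;
}
```

Called as query_get(r->args, "a", buf, sizeof buf) inside a handler, it would leave the value of the a parameter in buf.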

    /** A struct containing the components of URI */
    apr_uri_t parsed_uri;

The detailed result of parsing the request URI.

    char *useragent_ip;

The IP address the request came from.

    /** MIME header environment from the request */
    apr_table_t *headers_in;

The HTTP request headers, stored as a table.

For the fields of basic types, a routine is easy to write

    if (r->the_request) {
        ap_rprintf(r, "the request : %s\n", r->the_request);
    }
    else {
        ap_rprintf(r, "the request is NULL\n");
    }

    if (r->protocol) {
        ap_rprintf(r, "protocol : %s\n", r->protocol);
    }
    else {
        ap_rprintf(r, "protocol is NULL\n");
    }

    ap_rprintf(r, "proto_num is %d\n", r->proto_num);

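The request time is of type apr_time_t; for background, see the introduction to modules in "Server Setup Notes: Apache Module Development Basics". After reading the source, we can write the following routine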

    static void print_time(request_rec* r) {
        /* Without a request there is nowhere to write the output. */
        if (!r) {
            return;
        }

        /* Human-readable form of the request time. */
        char data_str[128] = {0};
        apr_status_t status = apr_ctime(data_str, r->request_time);
        if (APR_SUCCESS != status) {
            ap_rprintf(r, "apr_ctime error\n");
        }
        else {
            ap_rprintf(r, "ctime\t:\t%s\n", data_str);
        }

        /* Request time broken down into fields, expressed in GMT. */
        apr_time_exp_t exp_t;
        memset(&exp_t, 0, sizeof(exp_t));
        status = apr_time_exp_gmt(&exp_t, r->request_time);
        if (APR_SUCCESS != status) {
            ap_rprintf(r, "apr_time_exp_gmt error\n");
        }
        else {
            ap_rprintf(r, "exp time\t:\n");
            ap_rprintf(r, "\ttm_usec\t:\t%d\n", exp_t.tm_usec);
            ap_rprintf(r, "\ttm_sec\t:\t%d\n", exp_t.tm_sec);
            ap_rprintf(r, "\ttm_min\t:\t%d\n", exp_t.tm_min);
            ap_rprintf(r, "\ttm_hour\t:\t%d\n", exp_t.tm_hour);
            ap_rprintf(r, "\ttm_mday\t:\t%d\n", exp_t.tm_mday);
            ap_rprintf(r, "\ttm_mon\t:\t%d\n", exp_t.tm_mon);
            ap_rprintf(r, "\ttm_year\t:\t%d\n", exp_t.tm_year);
            ap_rprintf(r, "\ttm_wday\t:\t%d\n", exp_t.tm_wday);
            ap_rprintf(r, "\ttm_yday\t:\t%d\n", exp_t.tm_yday);
            ap_rprintf(r, "\ttm_isdst\t:\t%d\n", exp_t.tm_isdst);
            ap_rprintf(r, "\ttm_gmtoff\t:\t%d\n", exp_t.tm_gmtoff);
        }
    }

The apr_time_exp_t type is defined in apr_time.h.

    /**
     * a structure similar to ANSI struct tm with the following differences:
     *  - tm_usec isn't an ANSI field
     *  - tm_gmtoff isn't an ANSI field (it's a BSDism)
     */
    struct apr_time_exp_t {
        /** microseconds past tm_sec */
        apr_int32_t tm_usec;
        /** (0-61) seconds past tm_min */
        apr_int32_t tm_sec;
        /** (0-59) minutes past tm_hour */
        apr_int32_t tm_min;
        /** (0-23) hours past midnight */
        apr_int32_t tm_hour;
        /** (1-31) day of the month */
        apr_int32_t tm_mday;
        /** (0-11) month of the year */
        apr_int32_t tm_mon;
        /** year since 1900 */
        apr_int32_t tm_year;
        /** (0-6) days since Sunday */
        apr_int32_t tm_wday;
        /** (0-365) days since January 1 */
        apr_int32_t tm_yday;
        /** daylight saving time */
        apr_int32_t tm_isdst;
        /** seconds east of UTC */
        apr_int32_t tm_gmtoff;
    };
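Two conventions are easy to trip over when reading these fields: apr_time_t counts microseconds since the Unix epoch, and tm_year and tm_mon are offsets (years since 1900, months starting at 0). The standalone sketch below illustrates both with plain integer types standing in for the APR ones; all four function names are hypothetical. APR itself offers the apr_time_sec and apr_time_usec macros for the first split.

```c
#include <stdint.h>

/* apr_time_t is a 64-bit microsecond count; int64_t stands in for it here. */
#define USEC_PER_SEC 1000000

/* Whole seconds contained in a microsecond timestamp. */
int64_t time_seconds(int64_t t)
{
    return t / USEC_PER_SEC;
}

/* Leftover microseconds, the quantity reported as tm_usec. */
int32_t time_micros(int64_t t)
{
    return (int32_t)(t % USEC_PER_SEC);
}

/* tm_year is "years since 1900": 115 means 2015. */
int calendar_year(int tm_year)
{
    return tm_year + 1900;
}

/* tm_mon is 0-based: 1 means February. */
int calendar_month(int tm_mon)
{
    return tm_mon + 1;
}
```

So a broken-down time with tm_year 115 and tm_mon 1 falls in February 2015.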

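The routine for apr_uri_t, the structure holding the already-parsed URI, is just as simple, so I will not list it. I only paste the structure definition, which is self-explanatory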

    /**
     * A structure to encompass all of the fields in a uri
     */
    struct apr_uri_t {
        /** scheme ("http"/"ftp"/...) */
        char *scheme;
        /** combined [user[:password]\@]host[:port] */
        char *hostinfo;
        /** user name, as in http://user:passwd\@host:port/ */
        char *user;
        /** password, as in http://user:passwd\@host:port/ */
        char *password;
        /** hostname from URI (or from Host: header) */
        char *hostname;
        /** port string (integer representation is in "port") */
        char *port_str;
        /** the request path (or NULL if only scheme://host was given) */
        char *path;
        /** Everything after a '?' in the path, if present */
        char *query;
        /** Trailing "#fragment" string, if present */
        char *fragment;

        /** structure returned from gethostbyname() */
        struct hostent *hostent;

        /** The port number, numeric, valid only if port_str != NULL */
        apr_port_t port;

        /** has the structure been initialized */
        unsigned is_initialized:1;

        /** has the DNS been looked up yet */
        unsigned dns_looked_up:1;
        /** has the dns been resolved yet */
        unsigned dns_resolved:1;
    };

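The troublesome part of these routines is walking an apr_table_t. Code that iterates over this table was hard to find online, so I worked from the code in apr_table_clone and arrived at the following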

    static void print_table(request_rec *r, const apr_table_t* t) {
        /* A table is backed by an array of apr_table_entry_t key/value pairs. */
        const apr_array_header_t* array = apr_table_elts(t);
        apr_table_entry_t* elts = (apr_table_entry_t*)array->elts;

        for (int i = 0; i < array->nelts; i++) {
            ap_rprintf(r, "\t%s : %s\n", elts[i].key, elts[i].val);
        }
    }
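The same array-of-pairs layout also explains how a single header can be looked up. The sketch below is a standalone illustration, not APR code: struct entry is a hypothetical stand-in for apr_table_entry_t, and entries_get and equal_nocase are made-up names. It compares keys case-insensitively, which is how APR tables treat keys. In a real module, apr_table_get performs the lookup and apr_table_do offers callback-style iteration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for apr_table_entry_t: one key/value pair. */
struct entry {
    const char *key;
    const char *val;
};

/* Case-insensitive string equality. */
int equal_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == *b;
}

/* Value of the first entry whose key matches, or NULL if there is none. */
const char *entries_get(const struct entry *elts, int nelts, const char *key)
{
    for (int i = 0; i < nelts; i++) {
        if (equal_nocase(elts[i].key, key)) {
            return elts[i].val;
        }
    }
    return NULL;
}
```

With the entries {"Host", "192.168.191.129"} and {"Connection", "keep-alive"}, looking up "host" returns the address and looking up "Cookie" returns NULL.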

We request the URL http://192.168.191.129/AP%26AC%3aHE?a=b#c

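The response is as follows. The browser does not send the #c fragment to the server, as the request line shows, which is why fragment is reported as NULL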

    headers_in start
        Host : 192.168.191.129
        Connection : keep-alive
        Cache-Control : max-age=0
        Accept : text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
        User-Agent : Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36
        Accept-Encoding : gzip,deflate,sdch
        Accept-Language : zh-CN,zh;q=0.8
    headers_in end
    headers_out start
    headers_out end
    the request : GET /AP%26AC%3aHE?a=b HTTP/1.1
    protocol : HTTP/1.1
    proto_num is 1001
    method : GET
    host name : 192.168.191.129
    unparsed uri : /AP%26AC%3aHE?a=b
    uri : /AP&AC:HE
    filename : /usr/local/apache2/htdocs/AP&AC:HE
    path info :
    args : a=b
    user is NULL
    log id is NULL
    useragent ip : 192.168.191.1
    ctime   :   Mon Feb 16 18:20:39 2015
    exp time    :
        tm_usec    :   200039
        tm_sec    :   39
        tm_min    :   20
        tm_hour   :   10
        tm_mday   :   16
        tm_mon    :   1
        tm_year    :   115
        tm_wday  :   1
        tm_yday    :   46
        tm_isdst  :   0
        tm_gmtoff  :   0
    scheme is NULL
    hostinfo is NULL
    user is NULL
    password is NULL
    hostname is NULL
    port_str is NULL
    path : /AP&AC:HE
    query : a=b
    fragment is NULL
    The sample page from mod_get_request.c
