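Radeon GPU Source Code Analysis
1. What is Radeon
Overview:
Radeon (Chinese name: 镭龙™) is a product trademark. It is a family of graphics chips produced by AMD, and cards based on it are colloquially called "A卡" ("A cards").
The full name is usually written as AMD Radeon HD xxxx, for example the desktop card AMD Radeon HD 6450. After the HD 7000 series, AMD switched to the AMD Radeon R9/R7 xxx naming scheme for its new graphics chips.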
Desktop graphics cards released under the AMD Radeon name include:
- AMD Radeon RX 6000 series
- AMD Radeon RX 5000 series
- AMD Radeon VII
- AMD Radeon RX 500 series and Radeon RX Vega
- AMD Radeon RX 400 and Radeon 400 series [1]
- AMD Radeon R9/R7 300 series and FURY series [2]
- AMD Radeon R9/R7 200 series
- AMD Radeon HD 8000 series (OEM only)
- AMD Radeon HD 7000 series
- AMD Radeon HD 6000 series
- ATI Radeon HD 5000 series
- ATI Radeon HD 4000 series
- ATI Radeon HD 3000 series
- ATI Radeon HD 2000 series
- ATI Radeon X 1000 series
- ATI Radeon™ for Mac®
There is also AMD FirePro™ for professional users.
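2. What is DRM
Note: the DRM framework is covered in more detail later.
DRM
DRM stands for Direct Rendering Manager.
- DRM is the mainstream graphics display framework on Linux today. Compared with the older FB (framebuffer) architecture, it copes much better with fast-evolving display hardware.
- For example, FB has no native support for multi-layer composition, VSYNC, DMA-BUF, asynchronous updates, or fences. DRM supports all of these natively.
- DRM also manages GPU drivers and display drivers under one framework, which keeps the software architecture uniform and easier to maintain.
- DRM is a kernel-level device driver. It can be built into the kernel or loaded as a regular module.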
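- DRM was originally developed for Linux as the kernel-side component of the Direct Rendering Infrastructure. It was later ported to other systems such as FreeBSD and is now a standard part of the Linux kernel.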
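By module, DRM can be roughly divided into three parts: libdrm, KMS, and GEM.
LIBDRM
Wraps the low-level interfaces and exposes a generic API to upper layers. Most of it is a thin wrapper around the various ioctl calls.
KMS
KMS stands for Kernel Mode Setting. Mode setting means updating the picture and configuring display parameters.
Updating the picture: switching the display buffer, choosing how multiple layers are composed, and positioning each layer.
Configuring display parameters: resolution, refresh rate, power state (suspend/resume), and so on.
Elements:
FB (framebuffer)
To the computer, a framebuffer is simply a block of memory that both the driver and the application layer can access. It is the only basic element that is independent of the hardware. Before anything is drawn the buffer needs a format, for example a color mode (RGB24, I420, YUYV, etc.) and a resolution.
CRTC (scanout context)
The name comes from "CRT controller" (cathode ray tube controller). In DRM, a CRTC represents the context of a display output.
- It is the hardware block that scans out the display buffer and generates the timing signals, usually called the display controller.
- On the inside a CRTC points at a framebuffer address, and on the outside it feeds an encoder: FrameBuffer, then CRTC, then Encoder.
How these pieces talk to each other is exactly what mode setting has to define.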
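Mode setting covers the color mode mentioned above and also the display timing (the terms "timings" and "modelines" mean the same thing). A timing is usually described by the following fields:

| Name | Meaning |
| --- | --- |
| PCLK | Pixel clock |
| HFP | Horizontal front porch |
| HBP | Horizontal back porch |
| HSW | Horizontal sync width |
| X_RES | Horizontal active length |
| VFP | Vertical front porch |
| VBP | Vertical back porch |
| VSW | Vertical sync width |
| Y_RES | Vertical active length |

- One CRTC can drive several encoders at once, which is how display cloning (mirroring) is implemented.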
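The timing fields in the table above determine the refresh rate: each frame takes (X_RES + HFP + HSW + HBP) x (Y_RES + VFP + VSW + VBP) pixel clocks. The following is a minimal user-space sketch of that arithmetic. The struct `display_timing` and the helper names are hypothetical and only mirror the table; they are not DRM types.

```c
#include <stdint.h>

/* Hypothetical container for the timing fields from the table above. */
struct display_timing {
    uint32_t pclk_hz;               /* PCLK, in Hz */
    uint32_t x_res, hfp, hsw, hbp;  /* horizontal values, in pixels */
    uint32_t y_res, vfp, vsw, vbp;  /* vertical values, in lines */
};

/* Total pixel clocks per scanline, including horizontal blanking. */
static uint32_t timing_htotal(const struct display_timing *t)
{
    return t->x_res + t->hfp + t->hsw + t->hbp;
}

/* Total scanlines per frame, including vertical blanking. */
static uint32_t timing_vtotal(const struct display_timing *t)
{
    return t->y_res + t->vfp + t->vsw + t->vbp;
}

/* Refresh rate in millihertz: PCLK * 1000 / (htotal * vtotal). */
static uint32_t timing_refresh_mhz(const struct display_timing *t)
{
    uint64_t pixels_per_frame =
        (uint64_t)timing_htotal(t) * timing_vtotal(t);

    if (pixels_per_frame == 0)
        return 0;
    return (uint32_t)((uint64_t)t->pclk_hz * 1000u / pixels_per_frame);
}
```

For the common 1920x1080 timing with PCLK 148.5 MHz, HFP 88, HSW 44, HBP 148, VFP 4, VSW 5 and VBP 36, this gives 2200 clocks per line, 1125 lines per frame, and 60000 mHz (60 Hz).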
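Encoder (output converter)
- The block that converts the timing output by the CRTC into the signal an external device needs, for example an HDMI transmitter or a DSI controller.
- A graphics card can drive many kinds of devices, so each output needs its own converter to turn pixels in memory into the signal the display expects (DVI-D, VGA, YPbPr, CVBS, and so on).
Connector
- The connector that attaches to a physical display device, such as HDMI, DisplayPort, or a DSI bus. It is usually bound to the encoder driver.
- Within DRM it is an abstract data structure representing the attached display. From it you can read the device's EDID, DPMS state, connection status, and so on.
PLANE (hardware layer)
- Some display hardware can compose several layers, but every display controller has at least one plane.
VBLANK (vertical blanking)
- The synchronization mechanism between software and hardware. It corresponds to the vertical blanking interval in the RGB timing, and software usually implements it on top of the hardware VSYNC.
PROPERTY
- Any parameter you want to make configurable can be exposed as a property. It is the most flexible and convenient mode-setting mechanism in a DRM driver.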
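GEM
- GEM stands for Graphics Execution Manager. It is mainly responsible for allocating and freeing display buffers.
- It is also the part of DRM that the GPU side uses most directly.
Elements: DUMB, PRIME, FENCE
- DUMB: supports only physically contiguous memory, is built on the kernel's generic CMA API, and is mostly used for simple low-resolution scenarios.
- PRIME: supports both contiguous and non-contiguous physical memory, is built on DMA-BUF, allows buffers to be shared, and is mostly used for large-memory, complex scenarios.
- FENCE: the buffer synchronization mechanism, built on the kernel's dma_fence, used to keep display content from going out of sync.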
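3. Radeon initialization
This section walks through the DRM code to describe, from the bottom up, how the graphics driver is initialized. The hardware in question is AMD Radeon cards from the r600 generation onward.
The basic flow is: load the driver, initialize the hardware, set up the hardware-independent modules (such as the memory manager), and set up the display (resolution and so on).
Code path:
/home/fakechen/work/klinux-4.19/drivers/gpu/drm/radeon/radeon_drv.c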
    module_init(radeon_init);
    module_exit(radeon_exit);

    MODULE_AUTHOR(DRIVER_AUTHOR);
    MODULE_DESCRIPTION(DRIVER_DESC);
    MODULE_LICENSE("GPL and additional rights");

    static int __init radeon_init(void)
    {
        if (vgacon_text_force() && radeon_modeset == -1) {
            DRM_INFO("VGACON disable radeon kernel modesetting.\n");
            radeon_modeset = 0;
        }
        /* set to modesetting by default if not nomodeset */
        if (radeon_modeset == -1)
            radeon_modeset = 1;

        if (radeon_modeset == 1) {
            DRM_INFO("radeon kernel modesetting enabled.\n");
            driver = &kms_driver;
            pdriver = &radeon_kms_pci_driver;
            driver->driver_features |= DRIVER_MODESET;
            driver->num_ioctls = radeon_max_kms_ioctl;
            radeon_register_atpx_handler();
        } else {
            DRM_ERROR("No UMS support in radeon module!\n");
            return -EINVAL;
        }

        return pci_register_driver(pdriver);
    }

    static void __exit radeon_exit(void)
    {
        pci_unregister_driver(pdriver);
        radeon_unregister_atpx_handler();
    }
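The start of radeon_init above treats radeon_modeset as a three-state parameter: -1 means "decide automatically", 0 means off, and 1 means on. The decision can be modeled in user space as below. This is only a sketch: `resolve_modeset` and `toy_radeon_init` are hypothetical names, and the `vga_text_forced` argument stands in for the result of vgacon_text_force().

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Resolve the tri-state module parameter to a final 0/1 value.
 * modeset_param: -1 = auto, 0 = disabled, 1 = enabled.
 */
static int resolve_modeset(int modeset_param, bool vga_text_forced)
{
    /* A forced VGA text console only wins when the user left it on auto. */
    if (vga_text_forced && modeset_param == -1)
        modeset_param = 0;
    /* Otherwise auto means "enable kernel modesetting". */
    if (modeset_param == -1)
        modeset_param = 1;
    return modeset_param;
}

/* Models the outcome of the init function: 0 on success, -EINVAL without KMS. */
static int toy_radeon_init(int modeset_param, bool vga_text_forced)
{
    if (resolve_modeset(modeset_param, vga_text_forced) == 1)
        return 0;       /* the real code registers the PCI driver here */
    return -EINVAL;     /* no user-mode-setting fallback exists */
}
```

The point of the sketch is that an explicit setting of 1 survives even when the text console is forced, while the automatic setting does not.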
radeon_init calls pci_register_driver(pdriver) to register pdriver, which at that point points to radeon_kms_pci_driver:

    static struct drm_driver *driver;
    static struct pci_driver *pdriver;

    static struct pci_driver radeon_kms_pci_driver = {
        .name = DRIVER_NAME,
        .id_table = pciidlist,
        .probe = radeon_pci_probe,
        .remove = radeon_pci_remove,
        .shutdown = radeon_pci_shutdown,
        .driver.pm = &radeon_pm_ops,
    };
Once the driver has been matched to a device, the PCI core calls radeon_pci_probe. Its source is:

    static int radeon_pci_probe(struct pci_dev *pdev,
                                const struct pci_device_id *ent)
    {
        unsigned long flags = 0;
        int ret;

        if (!ent)
            return -ENODEV; /* Avoid NULL-ptr deref in drm_get_pci_dev */

        flags = ent->driver_data;

        if (!radeon_si_support) {
            switch (flags & RADEON_FAMILY_MASK) {
            case CHIP_TAHITI:
            case CHIP_PITCAIRN:
            case CHIP_VERDE:
            case CHIP_OLAND:
            case CHIP_HAINAN:
                dev_info(&pdev->dev,
                         "SI support disabled by module param\n");
                return -ENODEV;
            }
        }
        if (!radeon_cik_support) {
            switch (flags & RADEON_FAMILY_MASK) {
            case CHIP_KAVERI:
            case CHIP_BONAIRE:
            case CHIP_HAWAII:
            case CHIP_KABINI:
            case CHIP_MULLINS:
                dev_info(&pdev->dev,
                         "CIK support disabled by module param\n");
                return -ENODEV;
            }
        }

        if (vga_switcheroo_client_probe_defer(pdev))
            return -EPROBE_DEFER;

        /* Get rid of things like offb */
        ret = radeon_kick_out_firmware_fb(pdev);
        if (ret)
            return ret;

        return drm_get_pci_dev(pdev, ent, &kms_driver);
    }
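The probe function above packs the chip family and feature flags into one `driver_data` word and masks out the family before deciding whether a generation is allowed. A small user-space sketch of that pattern follows. The enum values, `TOY_FAMILY_MASK`, `TOY_FLAG_MOBILITY` and `probe_allowed` are illustrative assumptions and do not reproduce the kernel's actual constants.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed layout: low 16 bits hold the family, high bits hold flags. */
#define TOY_FAMILY_MASK   0x0000ffffUL
#define TOY_FLAG_MOBILITY 0x00010000UL

/* Illustrative subset of families; the numeric values are made up. */
enum toy_family {
    TOY_CHIP_OLDER = 1,
    TOY_CHIP_TAHITI,    /* stands for an SI-generation part */
    TOY_CHIP_PITCAIRN,  /* SI */
    TOY_CHIP_BONAIRE,   /* stands for a CIK-generation part */
    TOY_CHIP_HAWAII,    /* CIK */
};

/* Returns 0 if probing may continue, -ENODEV if the generation is disabled. */
static int probe_allowed(unsigned long driver_data,
                         bool si_support, bool cik_support)
{
    switch (driver_data & TOY_FAMILY_MASK) {
    case TOY_CHIP_TAHITI:
    case TOY_CHIP_PITCAIRN:
        return si_support ? 0 : -ENODEV;
    case TOY_CHIP_BONAIRE:
    case TOY_CHIP_HAWAII:
        return cik_support ? 0 : -ENODEV;
    default:
        return 0;   /* other generations are not gated */
    }
}
```

Because the family is masked out first, extra flag bits such as the mobility bit do not affect which branch is taken.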
The last thing radeon_pci_probe does is call drm_get_pci_dev(pdev, ent, &kms_driver), whose prototype is:

    int drm_get_pci_dev(struct pci_dev *pdev, const struct pci_device_id *ent,
                        struct drm_driver *driver)

kms_driver is initialized as follows:

    static struct drm_driver kms_driver = {
        .driver_features =
            DRIVER_USE_AGP |
            DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM |
            DRIVER_PRIME | DRIVER_RENDER,
        .load = radeon_driver_load_kms,
        .open = radeon_driver_open_kms,
        .postclose = radeon_driver_postclose_kms,
        .lastclose = radeon_driver_lastclose_kms,
        .unload = radeon_driver_unload_kms,
        .get_vblank_counter = radeon_get_vblank_counter_kms,
        .enable_vblank = radeon_enable_vblank_kms,
        .disable_vblank = radeon_disable_vblank_kms,
        .get_vblank_timestamp = drm_calc_vbltimestamp_from_scanoutpos,
        .get_scanout_position = radeon_get_crtc_scanout_position,
        .irq_preinstall = radeon_driver_irq_preinstall_kms,
        .irq_postinstall = radeon_driver_irq_postinstall_kms,
        .irq_uninstall = radeon_driver_irq_uninstall_kms,
        .irq_handler = radeon_driver_irq_handler_kms,
        .ioctls = radeon_ioctls_kms,
        .gem_free_object_unlocked = radeon_gem_object_free,
        .gem_open_object = radeon_gem_object_open,
        .gem_close_object = radeon_gem_object_close,
        .dumb_create = radeon_mode_dumb_create,
        .dumb_map_offset = radeon_mode_dumb_mmap,
        .fops = &radeon_driver_kms_fops,

        .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
        .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
        .gem_prime_export = radeon_gem_prime_export,
        .gem_prime_import = drm_gem_prime_import,
        .gem_prime_pin = radeon_gem_prime_pin,
        .gem_prime_unpin = radeon_gem_prime_unpin,
        .gem_prime_res_obj = radeon_gem_prime_res_obj,
        .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table,
        .gem_prime_import_sg_table = radeon_gem_prime_import_sg_table,
        .gem_prime_vmap = radeon_gem_prime_vmap,
        .gem_prime_vunmap = radeon_gem_prime_vunmap,

        .name = DRIVER_NAME,
        .desc = DRIVER_DESC,
        .date = DRIVER_DATE,
        .major = KMS_DRIVER_MAJOR,
        .minor = KMS_DRIVER_MINOR,
        .patchlevel = KMS_DRIVER_PATCHLEVEL,
    };
The code of drm_get_pci_dev is in drivers/gpu/drm/drm_pci.c:

    /**
     * drm_get_pci_dev - Register a PCI device with the DRM subsystem
     * @pdev: PCI device
     * @ent: entry from the PCI ID table that matches @pdev
     * @driver: DRM device driver
     *
     * Attempt to gets inter module "drm" information. If we are first
     * then register the character device and inter module information.
     * Try and register, if we fail to register, backout previous work.
     *
     * NOTE: This function is deprecated, please use drm_dev_alloc() and
     * drm_dev_register() instead and remove your &drm_driver.load callback.
     *
     * Return: 0 on success or a negative error code on failure.
     */
    int drm_get_pci_dev(struct pci_dev *pdev, const struct pci_device_id *ent,
                        struct drm_driver *driver)
    {
        struct drm_device *dev;
        int ret;

        DRM_DEBUG("\n");

        dev = drm_dev_alloc(driver, &pdev->dev);
        if (IS_ERR(dev))
            return PTR_ERR(dev);

        ret = pci_enable_device(pdev);
        if (ret)
            goto err_free;

        dev->pdev = pdev;
    #ifdef __alpha__
        dev->hose = pdev->sysdata;
    #endif

        if (drm_core_check_feature(dev, DRIVER_MODESET))
            pci_set_drvdata(pdev, dev);

        drm_pci_agp_init(dev);

        ret = drm_dev_register(dev, ent->driver_data);
        if (ret)
            goto err_agp;

        /* No locking needed since shadow-attach is single-threaded since it may
         * only be called from the per-driver module init hook. */
        if (drm_core_check_feature(dev, DRIVER_LEGACY))
            list_add_tail(&dev->legacy_dev_list, &driver->legacy_dev_list);

        return 0;

    err_agp:
        drm_pci_agp_destroy(dev);
        pci_disable_device(pdev);
    err_free:
        drm_dev_put(dev);
        return ret;
    }
    EXPORT_SYMBOL(drm_get_pci_dev);
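The function above uses the usual kernel error-handling shape: each step that can fail jumps to a label, and the labels undo the earlier steps in reverse order. Here is a self-contained user-space sketch of the same shape. `fake_dev`, `fake_setup` and the `live_devices` counter are hypothetical and exist only to make the unwinding observable.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device with two setup stages. */
struct fake_dev {
    int enabled;
    int registered;
};

/* Number of allocated devices that have not been freed yet. */
static int live_devices;

/*
 * Allocate, enable and register a device. The fail_* arguments inject an
 * error at the matching stage so the unwinding path can be exercised.
 */
static int fake_setup(int fail_enable, int fail_register, struct fake_dev **out)
{
    struct fake_dev *dev;
    int ret;

    dev = calloc(1, sizeof(*dev));
    if (!dev)
        return -ENOMEM;
    live_devices++;

    if (fail_enable) {
        ret = -EIO;
        goto err_free;      /* nothing to disable yet */
    }
    dev->enabled = 1;

    if (fail_register) {
        ret = -EBUSY;
        goto err_disable;   /* undo the enable step, then free */
    }
    dev->registered = 1;

    *out = dev;
    return 0;

err_disable:
    dev->enabled = 0;
err_free:
    free(dev);
    live_devices--;
    return ret;
}
```

A later failure falls through more labels than an earlier one, so every path releases exactly what had been acquired by that point.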
drm_get_pci_dev is a key function, and it calls several others. The first one is drm_dev_alloc.
- Call site:
    dev = drm_dev_alloc(driver, &pdev->dev);
- Source (drivers/gpu/drm/drm_drv.c):
    /**
     * drm_dev_alloc - Allocate new DRM device
     * @driver: DRM driver to allocate device for
     * @parent: Parent device object
     *
     * Allocate and initialize a new DRM device. No device registration is done.
     * Call drm_dev_register() to advertice the device to user space and register it
     * with other core subsystems. This should be done last in the device
     * initialization sequence to make sure userspace can't access an inconsistent
     * state.
     *
     * The initial ref-count of the object is 1. Use drm_dev_get() and
     * drm_dev_put() to take and drop further ref-counts.
     *
     * Note that for purely virtual devices @parent can be NULL.
     *
     * Drivers that wish to subclass or embed &struct drm_device into their
     * own struct should look at using drm_dev_init() instead.
     *
     * RETURNS:
     * Pointer to new DRM device, or ERR_PTR on failure.
     */
    struct drm_device *drm_dev_alloc(struct drm_driver *driver,
                                     struct device *parent)
    {
        struct drm_device *dev;
        int ret;

        dev = kzalloc(sizeof(*dev), GFP_KERNEL);
        if (!dev)
            return ERR_PTR(-ENOMEM);

        ret = drm_dev_init(dev, driver, parent);
        if (ret) {
            kfree(dev);
            return ERR_PTR(ret);
        }

        return dev;
    }
    EXPORT_SYMBOL(drm_dev_alloc);
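The struct drm_driver *driver argument of this function points to kms_driver.
What the function does: it allocates a zeroed struct drm_device and makes dev point to it. In other words, it creates a new DRM device (drm_device). On failure it returns an error encoded with ERR_PTR rather than NULL, which is why the caller checks the result with IS_ERR.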
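drm_dev_alloc reports failure through ERR_PTR, and its caller decodes the result with IS_ERR and PTR_ERR. The idea is that the top few thousand addresses are never valid pointers, so a negative errno can be stored in the pointer value itself. The sketch below models this in user space. The lower-case helpers `err_ptr`, `ptr_err`, `is_err` and the `toy_device` type are stand-ins written for this example, not the kernel macros.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Largest errno value that is encoded in a pointer (assumed, as in Linux). */
#define TOY_MAX_ERRNO 4095

/* Store a negative errno in a pointer value. */
static inline void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the errno from an error pointer. */
static inline long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the last TOY_MAX_ERRNO addresses. */
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_device {
    int initialized;
};

/*
 * Mirrors the shape of the allocation function above: allocate, initialize,
 * and return either a valid object or an encoded error. init_result plays
 * the role of the initializer's return value.
 */
static struct toy_device *toy_dev_alloc(int init_result)
{
    struct toy_device *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return err_ptr(-ENOMEM);
    if (init_result) {
        free(dev);
        return err_ptr(init_result);
    }
    dev->initialized = 1;
    return dev;
}
```

A caller therefore never compares the result with NULL. It tests `is_err(dev)` and, if that is true, returns `ptr_err(dev)` as the error code.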
Next, drm_dev_alloc calls drm_dev_init to initialize the newly created device.
The code is in drivers/gpu/drm/drm_drv.c:
    /**
     * drm_dev_init - Initialise new DRM device
     * @dev: DRM device
     * @driver: DRM driver
     * @parent: Parent device object
     *
     * Initialize a new DRM device. No device registration is done.
     * Call drm_dev_register() to advertice the device to user space and register it
     * with other core subsystems. This should be done last in the device
     * initialization sequence to make sure userspace can't access an inconsistent
     * state.
     *
     * The initial ref-count of the object is 1. Use drm_dev_get() and
     * drm_dev_put() to take and drop further ref-counts.
     *
     * Note that for purely virtual devices @parent can be NULL.
     *
     * Drivers that do not want to allocate their own device struct
     * embedding &struct drm_device can call drm_dev_alloc() instead. For drivers
     * that do embed &struct drm_device it must be placed first in the overall
     * structure, and the overall structure must be allocated using kmalloc(): The
     * drm core's release function unconditionally calls kfree() on the @dev pointer
     * when the final reference is released. To override this behaviour, and so
     * allow embedding of the drm_device inside the driver's device struct at an
     * arbitrary offset, you must supply a &drm_driver.release callback and control
     * the finalization explicitly.
     *
     * RETURNS:
     * 0 on success, or error code on failure.
     */
    int drm_dev_init(struct drm_device *dev,
                     struct drm_driver *driver,
                     struct device *parent)
    {
        int ret;

        if (!drm_core_init_complete) {
            DRM_ERROR("DRM core is not initialized\n");
            return -ENODEV;
        }

        kref_init(&dev->ref);
        dev->dev = get_device(parent);
        dev->driver = driver;

        INIT_LIST_HEAD(&dev->filelist);
        INIT_LIST_HEAD(&dev->filelist_internal);
        INIT_LIST_HEAD(&dev->clientlist);
        INIT_LIST_HEAD(&dev->ctxlist);
        INIT_LIST_HEAD(&dev->vmalist);
        INIT_LIST_HEAD(&dev->maplist);
        INIT_LIST_HEAD(&dev->vblank_event_list);

        spin_lock_init(&dev->buf_lock);
        spin_lock_init(&dev->event_lock);
        mutex_init(&dev->struct_mutex);
        mutex_init(&dev->filelist_mutex);
        mutex_init(&dev->clientlist_mutex);
        mutex_init(&dev->ctxlist_mutex);
        mutex_init(&dev->master_mutex);

        dev->anon_inode = drm_fs_inode_new();
        if (IS_ERR(dev->anon_inode)) {
            ret = PTR_ERR(dev->anon_inode);
            DRM_ERROR("Cannot allocate anonymous inode: %d\n", ret);
            goto err_free;
        }

        if (drm_core_check_feature(dev, DRIVER_RENDER)) {
            ret = drm_minor_alloc(dev, DRM_MINOR_RENDER);
            if (ret)
                goto err_minors;
        }

        ret = drm_minor_alloc(dev, DRM_MINOR_PRIMARY);
        if (ret)
            goto err_minors;

        ret = drm_ht_create(&dev->map_hash, 12);
        if (ret)
            goto err_minors;

        drm_legacy_ctxbitmap_init(dev);

        if (drm_core_check_feature(dev, DRIVER_GEM)) {
            ret = drm_gem_init(dev);
            if (ret) {
                DRM_ERROR("Cannot initialize graphics execution manager (GEM)\n");
                goto err_ctxbitmap;
            }
        }

        /* Use the parent device name as DRM device unique identifier, but fall
         * back to the driver name for virtual devices like vgem. */
        ret = drm_dev_set_unique(dev, parent ? dev_name(parent) : driver->name);
        if (ret)
            goto err_setunique;

        return 0;

    err_setunique:
        if (drm_core_check_feature(dev, DRIVER_GEM))
            drm_gem_destroy(dev);
    err_ctxbitmap:
        drm_legacy_ctxbitmap_cleanup(dev);
        drm_ht_remove(&dev->map_hash);
    err_minors:
        drm_minor_free(dev, DRM_MINOR_PRIMARY);
        drm_minor_free(dev, DRM_MINOR_RENDER);
        drm_fs_inode_free(dev->anon_inode);
    err_free:
        put_device(dev->dev);
        mutex_destroy(&dev->master_mutex);
        mutex_destroy(&dev->ctxlist_mutex);
        mutex_destroy(&dev->clientlist_mutex);
        mutex_destroy(&dev->filelist_mutex);
        mutex_destroy(&dev->struct_mutex);
        return ret;
    }
    EXPORT_SYMBOL(drm_dev_init);
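The kernel-doc comment already says it clearly: drm_dev_init initializes a new DRM device, that is, it fills in the struct drm_device *dev that drm_dev_alloc has just allocated. The parameters passed in are consumed by these two statements:

    dev->dev = get_device(parent);
    dev->driver = driver;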
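drm_dev_init also calls kref_init(&dev->ref), and its comment states that the initial reference count is 1 and that further references are taken and dropped with drm_dev_get() and drm_dev_put(). The sketch below models that lifetime rule in user space. It is deliberately simplified: the counter is a plain integer rather than an atomic one, and `toy_ref`, `toy_dev_new`, `toy_dev_get` and `toy_dev_put` are hypothetical names.

```c
#include <stdlib.h>

/* Simplified, non-atomic stand-in for a reference counter. */
struct toy_ref {
    unsigned int count;
};

struct toy_dev {
    struct toy_ref ref;
};

/* A freshly initialized object starts with exactly one reference. */
static void toy_ref_init(struct toy_ref *r)
{
    r->count = 1;
}

/* Allocate a device that is owned by the caller (count == 1). */
static struct toy_dev *toy_dev_new(void)
{
    struct toy_dev *d = calloc(1, sizeof(*d));

    if (d)
        toy_ref_init(&d->ref);
    return d;
}

/* Take an additional reference. */
static void toy_dev_get(struct toy_dev *d)
{
    d->ref.count++;
}

/*
 * Drop one reference. Frees the object and returns 1 when the last
 * reference goes away, otherwise returns 0.
 */
static int toy_dev_put(struct toy_dev *d)
{
    if (--d->ref.count == 0) {
        free(d);
        return 1;
    }
    return 0;
}
```

This is why the error path of drm_get_pci_dev ends with a single drm_dev_put(dev): dropping the initial reference is what releases the device.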
So far we have covered drm_dev_alloc, the first function called by drm_get_pci_dev.
The comment on drm_dev_alloc contains this passage:

    Allocate and initialize a new DRM device. No device registration is done.
    Call drm_dev_register() to advertice the device to user space and register it
    with other core subsystems. This should be done last in the device
    initialization sequence to make sure userspace can't access an inconsistent
    state.

In other words, the new DRM device has been initialized but not yet registered. drm_dev_register() has to be called afterwards, as the last step of initialization, to announce the device to user space.
Now look at the second function called by drm_get_pci_dev: drm_dev_register.
Call site:

    ret = drm_dev_register(dev, ent->driver_data);

Source (drivers/gpu/drm/drm_drv.c):

    /**
     * drm_dev_register - Register DRM device
     * @dev: Device to register
     * @flags: Flags passed to the driver's .load() function
     *
     * Register the DRM device @dev with the system, advertise device to user-space
     * and start normal device operation. @dev must be allocated via drm_dev_alloc()
     * previously.
     *
     * Never call this twice on any device!
     *
     * NOTE: To ensure backward compatibility with existing drivers method this
     * function calls the &drm_driver.load method after registering the device
     * nodes, creating race conditions. Usage of the &drm_driver.load methods is
     * therefore deprecated, drivers must perform all initialization before calling
     * drm_dev_register().
     *
     * RETURNS:
     * 0 on success, negative error code on failure.
     */
    int drm_dev_register(struct drm_device *dev, unsigned long flags)
    {
        struct drm_driver *driver = dev->driver;
        int ret;

        mutex_lock(&drm_global_mutex);

        ret = drm_minor_register(dev, DRM_MINOR_RENDER);
        if (ret)
            goto err_minors;

        ret = drm_minor_register(dev, DRM_MINOR_PRIMARY);
        if (ret)
            goto err_minors;

        ret = create_compat_control_link(dev);
        if (ret)
            goto err_minors;

        dev->registered = true;

        if (dev->driver->load) {
            ret = dev->driver->load(dev, flags);
            if (ret)
                goto err_minors;
        }

        if (drm_core_check_feature(dev, DRIVER_MODESET))
            drm_modeset_register_all(dev);

        ret = 0;

        DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
                 driver->name, driver->major, driver->minor,
                 driver->patchlevel, driver->date,
                 dev->dev ? dev_name(dev->dev) : "virtual device",
                 dev->primary->index);

        goto out_unlock;

    err_minors:
        remove_compat_control_link(dev);
        drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
        drm_minor_unregister(dev, DRM_MINOR_RENDER);
    out_unlock:
        mutex_unlock(&drm_global_mutex);
        return ret;
    }
    EXPORT_SYMBOL(drm_dev_register);
The part to focus on is this:

    if (dev->driver->load) {
        ret = dev->driver->load(dev, flags);
        if (ret)
            goto err_minors;
    }
- dev is the device that was allocated and initialized in drm_dev_alloc.
- dev->driver was set in drm_dev_init by dev->driver = driver, where driver is kms_driver.
- So dev->driver actually points to kms_driver.
- dev->driver->load therefore points to the .load = radeon_driver_load_kms entry of struct drm_driver kms_driver.
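The chain described above (device, then driver, then load callback) is an ordinary table of function pointers. The sketch below reproduces the dispatch in user space. All `toy_` names are hypothetical, and the model only records that the callback ran and which flags it received.

```c
#include <stddef.h>

struct toy_device;

/* Driver description: a name plus an optional load callback. */
struct toy_driver {
    const char *name;
    int (*load)(struct toy_device *dev, unsigned long flags);
};

struct toy_device {
    const struct toy_driver *driver;  /* set when the device is initialized */
    int registered;
    unsigned long seen_flags;         /* written by the load callback */
};

/* Stand-in for the driver-specific load function. */
static int toy_load_kms(struct toy_device *dev, unsigned long flags)
{
    dev->seen_flags = flags;
    return 0;
}

static const struct toy_driver toy_kms_driver = {
    .name = "toy",
    .load = toy_load_kms,
};

/* Registration step: mark the device registered, then call load if present. */
static int toy_dev_register(struct toy_device *dev, unsigned long flags)
{
    int ret = 0;

    dev->registered = 1;
    if (dev->driver->load)
        ret = dev->driver->load(dev, flags);
    return ret;
}
```

The generic registration code never names the driver function directly. It only follows the pointer stored in the device, which is how one DRM core can serve many drivers.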
radeon_driver_load_kms is the starting point for everything related to GPU initialization.
- It calls radeon_device_init to initialize the non-display parts of the chip (asic init, CP, writeback, etc.).
- It calls radeon_modeset_init to initialize the display parts (CRTCs, connectors, encoders, hotplug detect, etc.).
Source (drivers/gpu/drm/radeon/radeon_kms.c):
    /**
     * radeon_driver_load_kms - Main load function for KMS.
     *
     * @dev: drm dev pointer
     * @flags: device flags
     *
     * This is the main load function for KMS (all asics).
     * It calls radeon_device_init() to set up the non-display
     * parts of the chip (asic init, CP, writeback, etc.), and
     * radeon_modeset_init() to set up the display parts
     * (crtcs, encoders, hotplug detect, etc.).
     * Returns 0 on success, error on failure.
     */
    int radeon_driver_load_kms(struct drm_device *dev, unsigned long flags)
    {
        struct radeon_device *rdev;
        int r, acpi_status;

    #ifdef CONFIG_CPU_LOONGSON3
        turn_off_lvds();
    #endif

        rdev = kzalloc(sizeof(struct radeon_device), GFP_KERNEL);
        if (rdev == NULL) {
            return -ENOMEM;
        }
        dev->dev_private = (void *)rdev;

        /* update BUS flag */
        if (pci_find_capability(dev->pdev, PCI_CAP_ID_AGP)) {
            flags |= RADEON_IS_AGP;
        } else if (pci_is_pcie(dev->pdev)) {
            flags |= RADEON_IS_PCIE;
        } else {
            flags |= RADEON_IS_PCI;
        }

        if ((radeon_runtime_pm != 0) &&
            radeon_has_atpx() &&
            ((flags & RADEON_IS_IGP) == 0) &&
            !pci_is_thunderbolt_attached(dev->pdev))
            flags |= RADEON_IS_PX;

        /* radeon_device_init should report only fatal error
         * like memory allocation failure or iomapping failure,
         * or memory manager initialization failure, it must
         * properly initialize the GPU MC controller and permit
         * VRAM allocation
         */
        r = radeon_device_init(rdev, dev, dev->pdev, flags);
        if (r) {
            dev_err(&dev->pdev->dev, "Fatal error during GPU init\n");
            goto out;
        }

        /* Again modeset_init should fail only on fatal error
         * otherwise it should provide enough functionalities
         * for shadowfb to run
         */
        r = radeon_modeset_init(rdev);
        if (r)
            dev_err(&dev->pdev->dev, "Fatal error during modeset init\n");

    #ifdef CONFIG_CPU_LOONGSON3
        turn_on_lvds();
    #endif

        /* Call ACPI methods: require modeset init
         * but failure is not fatal
         */
        if (!r) {
            acpi_status = radeon_acpi_init(rdev);
            if (acpi_status)
                dev_dbg(&dev->pdev->dev,
                        "Error during ACPI methods call\n");
        }

        if (radeon_is_px(dev)) {
            dev_pm_set_driver_flags(dev->dev, DPM_FLAG_NEVER_SKIP);
            pm_runtime_use_autosuspend(dev->dev);
            pm_runtime_set_autosuspend_delay(dev->dev, 5000);
            pm_runtime_set_active(dev->dev);
            pm_runtime_allow(dev->dev);
            pm_runtime_mark_last_busy(dev->dev);
            pm_runtime_put_autosuspend(dev->dev);
        }

    out:
        if (r)
            radeon_driver_unload_kms(dev);

        return r;
    }
The function starts by allocating an instance of struct radeon_device (declared as struct radeon_device *rdev;), which amounts to creating a new radeon device.
struct radeon_device is defined in drivers/gpu/drm/radeon/radeon.h as follows:

    /*
     * Core structure, functions and helpers.
     */
    typedef uint32_t (*radeon_rreg_t)(struct radeon_device*, uint32_t);
    typedef void (*radeon_wreg_t)(struct radeon_device*, uint32_t, uint32_t);

    struct radeon_device {
        struct device *dev;
        struct drm_device *ddev;
        struct pci_dev *pdev;
        struct rw_semaphore exclusive_lock;
        /* ASIC */
        union radeon_asic_config config;
        enum radeon_family family;
        unsigned long flags;
        int usec_timeout;
        enum radeon_pll_errata pll_errata;
        int num_gb_pipes;
        int num_z_pipes;
        int disp_priority;
        /* BIOS */
        uint8_t *bios;
        bool is_atom_bios;
        uint16_t bios_header_start;
        struct radeon_bo *stolen_vga_memory;
        /* Register mmio */
        resource_size_t rmmio_base;
        resource_size_t rmmio_size;
        /* protects concurrent MM_INDEX/DATA based register access */
        spinlock_t mmio_idx_lock;
        /* protects concurrent SMC based register access */
        spinlock_t smc_idx_lock;
        /* protects concurrent PLL register access */
        spinlock_t pll_idx_lock;
        /* protects concurrent MC register access */
        spinlock_t mc_idx_lock;
        /* protects concurrent PCIE register access */
        spinlock_t pcie_idx_lock;
        /* protects concurrent PCIE_PORT register access */
        spinlock_t pciep_idx_lock;
        /* protects concurrent PIF register access */
        spinlock_t pif_idx_lock;
        /* protects concurrent CG register access */
        spinlock_t cg_idx_lock;
        /* protects concurrent UVD register access */
        spinlock_t uvd_idx_lock;
        /* protects concurrent RCU register access */
        spinlock_t rcu_idx_lock;
        /* protects concurrent DIDT register access */
        spinlock_t didt_idx_lock;
        /* protects concurrent ENDPOINT (audio) register access */
        spinlock_t end_idx_lock;
        void __iomem *rmmio;
        radeon_rreg_t mc_rreg;
        radeon_wreg_t mc_wreg;
        radeon_rreg_t pll_rreg;
        radeon_wreg_t pll_wreg;
        uint32_t pcie_reg_mask;
        radeon_rreg_t pciep_rreg;
        radeon_wreg_t pciep_wreg;
        /* io port */
        void __iomem *rio_mem;
        resource_size_t rio_mem_size;
        struct radeon_clock clock;
        struct radeon_mc mc;
        struct radeon_gart gart;
        struct radeon_mode_info mode_info;
        struct radeon_scratch scratch;
        struct radeon_doorbell doorbell;
        struct radeon_mman mman;
        struct radeon_fence_driver fence_drv[RADEON_NUM_RINGS];
        wait_queue_head_t fence_queue;
        u64 fence_context;
        struct mutex ring_lock;
        struct radeon_ring ring[RADEON_NUM_RINGS];
        bool ib_pool_ready;
        struct radeon_sa_manager ring_tmp_bo;
        struct radeon_irq irq;
        struct radeon_asic *asic;
        struct radeon_gem gem;
        struct radeon_pm pm;
        struct radeon_uvd uvd;
        struct radeon_vce vce;
        uint32_t bios_scratch[RADEON_BIOS_NUM_SCRATCH];
        struct radeon_wb wb;
        struct radeon_dummy_page dummy_page;
        bool shutdown;
        bool need_dma32;
        bool need_swiotlb;
        bool accel_working;
        bool fastfb_working; /* IGP feature*/
        bool needs_reset, in_reset;
        struct radeon_surface_reg surface_regs[RADEON_GEM_MAX_SURFACES];
        const struct firmware *me_fw;   /* all family ME firmware */
        const struct firmware *pfp_fw;  /* r6/700 PFP firmware */
        const struct firmware *rlc_fw;  /* r6/700 RLC firmware */
        const struct firmware *mc_fw;   /* NI MC firmware */
        const struct firmware *ce_fw;   /* SI CE firmware */
        const struct firmware *mec_fw;  /* CIK MEC firmware */
        const struct firmware *mec2_fw; /* KV MEC2 firmware */
        const struct firmware *sdma_fw; /* CIK SDMA firmware */
        const struct firmware *smc_fw;  /* SMC firmware */
        const struct firmware *uvd_fw;  /* UVD firmware */
        const struct firmware *vce_fw;  /* VCE firmware */
        bool new_fw;
        struct r600_vram_scratch vram_scratch;
        int msi_enabled; /* msi enabled */
        struct r600_ih ih; /* r6/700 interrupt ring */
        struct radeon_rlc rlc;
        struct radeon_mec mec;
        struct delayed_work hotplug_work;
        struct work_struct dp_work;
        struct work_struct audio_work;
        int need_recover;
        struct delayed_work recover_work;
        int num_crtc; /* number of crtcs */
        struct mutex dc_hw_i2c_mutex; /* display controller hw i2c mutex */
        bool has_uvd;
        bool has_vce;
        struct r600_audio audio; /* audio stuff */
        struct notifier_block acpi_nb;
        /* only one userspace can use Hyperz features or CMASK at a time */
        struct drm_file *hyperz_filp;
        struct drm_file *cmask_filp;
        /* i2c buses */
        struct radeon_i2c_chan *i2c_bus[RADEON_MAX_I2C_BUS];
        /* debugfs */
        struct radeon_debugfs debugfs[RADEON_DEBUGFS_MAX_COMPONENTS];
        unsigned debugfs_count;
        /* virtual memory */
        struct radeon_vm_manager vm_manager;
        struct mutex gpu_clock_mutex;
        /* memory stats */
        atomic64_t vram_usage;
        atomic64_t gtt_usage;
        atomic64_t num_bytes_moved;
        atomic_t gpu_reset_counter;
        /* ACPI interface */
        struct radeon_atif atif;
        struct radeon_atcs atcs;
        /* srbm instance registers */
        struct mutex srbm_mutex;
        /* clock, powergating flags */
        u32 cg_flags;
        u32 pg_flags;

        struct dev_pm_domain vga_pm_domain;
        bool have_disp_power_ref;
        u32 px_quirk_flags;

        /* tracking pinned memory */
        u64 vram_pin_size;
        u64 gart_pin_size;

        struct mutex mn_lock;
        DECLARE_HASHTABLE(mn_hash, 7);
    };
Right after the allocation comes dev->dev_private = (void *)rdev. This statement is what links struct drm_device to struct radeon_device.
Key code:

    struct radeon_device *rdev;
    rdev = kzalloc(sizeof(struct radeon_device), GFP_KERNEL);
    dev->dev_private = (void *)rdev;
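The three statements above attach driver-specific state to the generic device through an untyped pointer. The sketch below shows the same pattern in user space, including the way back from the driver state to the generic device (radeon_device_init stores it as rdev->ddev). The `toy_` types and functions are hypothetical stand-ins.

```c
#include <errno.h>
#include <stdlib.h>

/* Generic device: knows nothing about the driver's private type. */
struct toy_drm_device {
    void *dev_private;
};

/* Driver-specific state, with a back-pointer to the generic device. */
struct toy_radeon_device {
    struct toy_drm_device *ddev;
    unsigned long flags;
};

/* Allocate the private state and link both structures to each other. */
static int toy_load(struct toy_drm_device *dev, unsigned long flags)
{
    struct toy_radeon_device *rdev = calloc(1, sizeof(*rdev));

    if (!rdev)
        return -ENOMEM;
    dev->dev_private = rdev;  /* generic -> driver-specific */
    rdev->ddev = dev;         /* driver-specific -> generic */
    rdev->flags = flags;
    return 0;
}

/* Typed accessor so callers do not repeat the cast. */
static struct toy_radeon_device *to_rdev(struct toy_drm_device *dev)
{
    return dev->dev_private;
}

/* Release the private state and clear the link. */
static void toy_unload(struct toy_drm_device *dev)
{
    free(dev->dev_private);
    dev->dev_private = NULL;
}
```

Any later callback that receives only the generic device can recover its own state with one pointer read, which is what the driver does throughout.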
Next, look at radeon_device_init, one of the two core functions called by radeon_driver_load_kms.
Call site:

    r = radeon_device_init(rdev, dev, dev->pdev, flags);

Source (drivers/gpu/drm/radeon/radeon_device.c):
    /**
     * radeon_device_init - initialize the driver
     *
     * @rdev: radeon_device pointer
     * @pdev: drm dev pointer
     * @pdev: pci dev pointer
     * @flags: driver flags
     *
     * Initializes the driver info and hw (all asics).
     * Returns 0 for success or an error on failure.
     * Called at driver startup.
     */
    int radeon_device_init(struct radeon_device *rdev,
                           struct drm_device *ddev,
                           struct pci_dev *pdev,
                           uint32_t flags)
    {
        int r, i;
        int dma_bits;
        bool runtime = false;

        rdev->shutdown = false;
        rdev->dev = &pdev->dev;
        rdev->ddev = ddev;
        rdev->pdev = pdev;
        rdev->flags = flags;
        rdev->family = flags & RADEON_FAMILY_MASK;
        rdev->is_atom_bios = false;
        rdev->usec_timeout = RADEON_MAX_USEC_TIMEOUT;
        rdev->mc.gtt_size = 512 * 1024 * 1024;
        rdev->accel_working = false;
        /* set up ring ids */
        for (i = 0; i < RADEON_NUM_RINGS; i++) {
            rdev->ring[i].idx = i;
        }
        rdev->fence_context = dma_fence_context_alloc(RADEON_NUM_RINGS);

        DRM_INFO("initializing kernel modesetting (%s 0x%04X:0x%04X 0x%04X:0x%04X 0x%02X).\n",
                 radeon_family_name[rdev->family], pdev->vendor, pdev->device,
                 pdev->subsystem_vendor, pdev->subsystem_device, pdev->revision);

        /* mutex initialization are all done here so we
         * can recall function without having locking issues */
        mutex_init(&rdev->ring_lock);
        mutex_init(&rdev->dc_hw_i2c_mutex);
        atomic_set(&rdev->ih.lock, 0);
        mutex_init(&rdev->gem.mutex);
        mutex_init(&rdev->pm.mutex);
        mutex_init(&rdev->gpu_clock_mutex);
        mutex_init(&rdev->srbm_mutex);
        init_rwsem(&rdev->pm.mclk_lock);
        init_rwsem(&rdev->exclusive_lock);
        init_waitqueue_head(&rdev->irq.vblank_queue);
        mutex_init(&rdev->mn_lock);
        hash_init(rdev->mn_hash);
        r = radeon_gem_init(rdev);
        if (r)
            return r;

        radeon_check_arguments(rdev);
        /* Adjust VM size here.
         * Max GPUVM size for cayman+ is 40 bits.
         */
        rdev->vm_manager.max_pfn = radeon_vm_size << 18;

        /* Set asic functions */
        r = radeon_asic_setup(rdev);
        if (r)
            return r;

        /* all of the newer IGP chips have an internal gart
         * However some rs4xx report as AGP, so remove that here.
         */
        if ((rdev->family >= CHIP_RS400) &&
            (rdev->flags & RADEON_IS_IGP)) {
            rdev->flags &= ~RADEON_IS_AGP;
        }

        if (rdev->flags & RADEON_IS_AGP && radeon_agpmode == -1) {
            radeon_agp_disable(rdev);
        }

        /* Set the internal MC address mask
         * This is the max address of the GPU's
         * internal address space.
         */
        if (rdev->family >= CHIP_CAYMAN)
            rdev->mc.mc_mask = 0xffffffffffULL; /* 40 bit MC */
        else if (rdev->family >= CHIP_CEDAR)
            rdev->mc.mc_mask = 0xfffffffffULL; /* 36 bit MC */
        else
            rdev->mc.mc_mask = 0xffffffffULL; /* 32 bit MC */

        /* set DMA mask + need_dma32 flags.
         * PCIE - can handle 40-bits.
         * IGP - can handle 40-bits
         * AGP - generally dma32 is safest
         * PCI - dma32 for legacy pci gart, 40 bits on newer asics
         */
        rdev->need_dma32 = false;
        if (rdev->flags & RADEON_IS_AGP)
            rdev->need_dma32 = true;
        if ((rdev->flags & RADEON_IS_PCI) &&
            (rdev->family <= CHIP_RS740))
            rdev->need_dma32 = true;
    #ifdef CONFIG_PPC64
        if (rdev->family == CHIP_CEDAR)
            rdev->need_dma32 = true;
    #endif

        dma_bits = rdev->need_dma32 ? 32 : 40;
        r = pci_set_dma_mask(rdev->pdev, DMA_BIT_MASK(dma_bits));
        if (r) {
            rdev->need_dma32 = true;
            dma_bits = 32;
            pr_warn("radeon: No suitable DMA available\n");
        }
        r = pci_set_consistent_dma_mask(rdev->pdev, DMA_BIT_MASK(dma_bits));
        if (r) {
            pci_set_consistent_dma_mask(rdev->pdev, DMA_BIT_MASK(32));
            pr_warn("radeon: No coherent DMA available\n");
        }
        rdev->need_swiotlb = drm_get_max_iomem() > ((u64)1 << dma_bits);

        /* Registers mapping */
        /* TODO: block userspace mapping of io register */
        spin_lock_init(&rdev->mmio_idx_lock);
        spin_lock_init(&rdev->smc_idx_lock);
        spin_lock_init(&rdev->pll_idx_lock);
        spin_lock_init(&rdev->mc_idx_lock);
        spin_lock_init(&rdev->pcie_idx_lock);
        spin_lock_init(&rdev->pciep_idx_lock);
        spin_lock_init(&rdev->pif_idx_lock);
        spin_lock_init(&rdev->cg_idx_lock);
        spin_lock_init(&rdev->uvd_idx_lock);
        spin_lock_init(&rdev->rcu_idx_lock);
        spin_lock_init(&rdev->didt_idx_lock);
        spin_lock_init(&rdev->end_idx_lock);
        if (rdev->family >= CHIP_BONAIRE) {
            rdev->rmmio_base = pci_resource_start(rdev->pdev, 5);
            rdev->rmmio_size = pci_resource_len(rdev->pdev, 5);
        } else {
            rdev->rmmio_base = pci_resource_start(rdev->pdev, 2);
            rdev->rmmio_size = pci_resource_len(rdev->pdev, 2);
        }
        rdev->rmmio = ioremap(rdev->rmmio_base, rdev->rmmio_size);
        if (rdev->rmmio == NULL)
            return -ENOMEM;

        /* doorbell bar mapping */
        if (rdev->family >= CHIP_BONAIRE)
            radeon_doorbell_init(rdev);

        /* io port mapping */
        for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
            if (pci_resource_flags(rdev->pdev, i) & IORESOURCE_IO) {
                rdev->rio_mem_size = pci_resource_len(rdev->pdev, i);
                rdev->rio_mem = pci_iomap(rdev->pdev, i, rdev->rio_mem_size);
                break;
            }
        }
        if (rdev->rio_mem == NULL)
            DRM_ERROR("Unable to find PCI I/O BAR\n");

        if (rdev->flags & RADEON_IS_PX)
            radeon_device_handle_px_quirks(rdev);

        /* if we have > 1 VGA cards, then disable the radeon VGA resources */
        /* this will fail for cards that aren't VGA class devices, just
         * ignore it */
        vga_client_register(rdev->pdev, rdev, NULL, radeon_vga_set_decode);

        if (rdev->flags & RADEON_IS_PX)
            runtime = true;
        if (!pci_is_thunderbolt_attached(rdev->pdev))
            vga_switcheroo_register_client(rdev->pdev,
                                           &radeon_switcheroo_ops, runtime);
        if (runtime)
            vga_switcheroo_init_domain_pm_ops(rdev->dev, &rdev->vga_pm_domain);

        r = radeon_asic_init(rdev);
        if (r)
            goto failed;

        r = radeon_gem_debugfs_init(rdev);
        if (r) {
            DRM_ERROR("registering gem debugfs failed (%d).\n", r);
        }

        r = radeon_mst_debugfs_init(rdev);
        if (r) {
            DRM_ERROR("registering mst debugfs failed (%d).\n", r);
        }

        if (rdev->flags & RADEON_IS_AGP && !rdev->accel_working) {
            /* Acceleration not working on AGP card try again
             * with fallback to PCI or PCIE GART
             */
            radeon_asic_reset(rdev);
            radeon_asic_fini(rdev);
            radeon_agp_disable(rdev);
            r = radeon_asic_init(rdev);
            if (r)
                goto failed;
        }

        r = radeon_ib_ring_tests(rdev);
        if (r)
            DRM_ERROR("ib ring test failed (%d).\n", r);

        /*
         * Turks/Thames GPU will freeze whole laptop if DPM is not restarted
         * after the CP ring have chew one packet at least. Hence here we stop
         * and restart DPM after the radeon_ib_ring_tests().
         */
        if (rdev->pm.dpm_enabled &&
            (rdev->pm.pm_method == PM_METHOD_DPM) &&
            (rdev->family == CHIP_TURKS) &&
            (rdev->flags & RADEON_IS_MOBILITY)) {
            mutex_lock(&rdev->pm.mutex);
            radeon_dpm_disable(rdev);
            radeon_dpm_enable(rdev);
            mutex_unlock(&rdev->pm.mutex);
        }

        if ((radeon_testing & 1)) {
            if (rdev->accel_working)
                radeon_test_moves(rdev);
            else
                DRM_INFO("radeon: acceleration disabled, skipping move tests\n");
        }
        if ((radeon_testing & 2)) {
            if (rdev->accel_working)
                radeon_test_syncing(rdev);
            else
                DRM_INFO("radeon: acceleration disabled, skipping sync tests\n");
        }
        if (radeon_benchmarking) {
            if (rdev->accel_working)
                radeon_benchmark(rdev, radeon_benchmarking);
            else
                DRM_INFO("radeon: acceleration disabled, skipping benchmarks\n");
        }
        return 0;

    failed:
        /* balance pm_runtime_get_sync() in radeon_driver_unload_kms() */
        if (radeon_is_px(ddev))
            pm_runtime_put_noidle(ddev->dev);
        if (runtime)
            vga_switcheroo_fini_domain_pm_ops(rdev->dev);
        return r;
    }
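Several values in the function above are plain bit arithmetic: the DMA mask is "n low bits set", the swiotlb decision compares the highest system address with 2^dma_bits, and max_pfn is the VM size shifted left by 18. The sketch below works these out in user space. `TOY_DMA_BIT_MASK` follows the usual definition of such a mask, the helper names are hypothetical, and the `vm_max_pfn` comment assumes the VM size is given in GiB with 4 KiB GPU pages (2^30 / 2^12 = 2^18 pages per GiB).

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low n bits set; n == 64 is special-cased to avoid UB. */
#define TOY_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* 32-bit DMA when the device needs it, otherwise 40-bit, as in the code above. */
static int pick_dma_bits(bool need_dma32)
{
    return need_dma32 ? 32 : 40;
}

/* Bounce buffering is needed when memory exists above the addressable range. */
static bool needs_bounce(uint64_t max_iomem, int dma_bits)
{
    return max_iomem > ((uint64_t)1 << dma_bits);
}

/* Number of page frames in a VM of size_gib GiB, assuming 4 KiB pages. */
static uint64_t vm_max_pfn(uint64_t size_gib)
{
    return size_gib << 18;
}
```

For example, a machine with memory at 8 GiB (2^33) needs bounce buffers for a 32-bit device but not for a 40-bit one, and an 8 GiB VM spans 2097152 pages under the stated assumption.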
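- As mentioned earlier, radeon_device_init initializes the non-display parts of the chip (asic init, CP, writeback, etc.).
- It first initializes a large number of structures that the driver needs, and later calls radeon_asic_init.
- In the code shown above, the ASIC-specific function pointers (suspend/resume, hardware reset, setting up and handling interrupts, setting and reading clocks, and so on) are installed by radeon_asic_setup, as its "Set asic functions" comment indicates. Judging by how it is used, radeon_asic_init(rdev) then runs the ASIC-specific hardware initialization through those pointers.
- Now look at the chain radeon_driver_load_kms --> radeon_device_init --> radeon_gem_init.
- Source (drivers/gpu/drm/radeon/radeon_gem.c):

    int radeon_gem_init(struct radeon_device *rdev)
    {
        INIT_LIST_HEAD(&rdev->gem.objects);
        return 0;
    }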
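Finally, a short function! That leaves room to cover a related topic: kernel linked lists.
The Linux kernel provides a structure, list_head, for building circular doubly linked lists.
Although the kernel is written in C, list_head gives kernel data structures an object-oriented flavor. The generic helpers that operate on list_head make code reuse easy, somewhat like inheritance in C++.
The list code lives in include/linux/list.h:

    #define LIST_HEAD_INIT(name) { &(name), &(name) }

    #define LIST_HEAD(name) \
        struct list_head name = LIST_HEAD_INIT(name)

    static inline void INIT_LIST_HEAD(struct list_head *list)
    {
        WRITE_ONCE(list->next, list);
        list->prev = list;
    }

struct list_head itself is defined in include/linux/types.h:

    struct list_head {
        struct list_head *next, *prev;
    };

One point needs particular attention: the head node is not used to hold data. It only anchors the list.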
The structure of a list organized with list_head is shown in the following figure:
[Figure not available: layout of a list built from list_head nodes]
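With that background on kernel lists, return to radeon_gem_init:

    INIT_LIST_HEAD(&rdev->gem.objects);

In this statement rdev is an instance of struct radeon_device, which has a gem member:

    struct radeon_device {
        /* ... other members omitted ... */
        struct radeon_gem gem;
    };

struct radeon_gem is defined in drivers/gpu/drm/radeon/radeon.h:

    struct radeon_gem {
        struct mutex mutex;
        struct list_head objects;
    };

So the purpose of the function is clear: it initializes the kernel list rdev->gem.objects.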
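To make the list mechanics concrete, here is a minimal user-space re-implementation of the pieces used above: an empty list whose head points at itself, insertion at the tail, and recovery of the containing object from its embedded node. The helper names and the `toy_bo` type are hypothetical and written for this example; the kernel's own versions are INIT_LIST_HEAD, list_add_tail and container_of.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head node that points to itself in both directions. */
static void init_list_head(struct list_head *list)
{
    list->next = list;
    list->prev = list;
}

/* Insert node just before head, i.e. at the tail of the list. */
static void list_add_tail_node(struct list_head *node, struct list_head *head)
{
    struct list_head *last = head->prev;

    node->next = head;
    node->prev = last;
    last->next = node;
    head->prev = node;
}

/* Get the enclosing structure from a pointer to its embedded member. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical buffer object that embeds a list node. */
struct toy_bo {
    int size;
    struct list_head list;
};

/* Walk the list from head->next until we are back at the head. */
static int total_size(struct list_head *head)
{
    int sum = 0;

    for (struct list_head *pos = head->next; pos != head; pos = pos->next)
        sum += toy_container_of(pos, struct toy_bo, list)->size;
    return sum;
}
```

The head node carries no payload, so iteration starts at head->next and stops when it reaches the head again. This is the reuse that the comparison with inheritance refers to: any structure becomes listable by embedding one list_head member.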