Journal Navigation

456 items found.

  • [Conference] Deep Learning Based Hand Detection in Cluttered Environment Using Skin Segmentation
    Robust detection of hand gestures has remained a challenge due to background clutter encountered in real-world environments. In this work, a two-stage deep learning based approach is presented to detect hands robustly in unconstrained scenarios. We evaluate two recently proposed object detection techniques to initially locate hands in the input images. To further enhance the output of the hand detector, we propose a convolutional neural network (CNN) based skin detection technique which significantly reduces the occurrence of false positives. We show qualitative and quantitative results of the proposed hand detection algorithm on several public datasets, including the Oxford, 5-signer and EgoHands datasets. As a case study, we also report hand detection results robust to clutter on a proposed dataset of Indian classical dance (ICD) images.
    Keywords: Skin;Proposals;Detectors;Robustness;Image segmentation;Machine learning;Object detection
  • [Conference] What are the Visual Features Underlying Human Versus Machine Vision?
    Although Deep Convolutional Networks (DCNs) are approaching the accuracy of human observers at object recognition, it is unknown whether they leverage similar visual representations to achieve this performance. To address this, we introduce Clicktionary, a web-based game for identifying visual features used by human observers during object recognition. Importance maps derived from the game are consistent across participants and uncorrelated with image saliency measures. These results suggest that Clicktionary identifies image regions that are meaningful and diagnostic for object recognition but different than those driving eye movements. Surprisingly, Clicktionary importance maps are only weakly correlated with relevance maps derived from DCNs trained for object recognition. Our study demonstrates that the narrowing gap between the object recognition accuracy of human observers and DCNs obscures distinct visual strategies used by each to achieve this performance.
    Keywords: Games;Visualization;Object recognition;Image recognition;Observers;Mice
  • [Conference] Hierarchical Grouping — The Gestalt Assessments Method
    Real images contain reflection symmetry and repetition in rows with high probability. That is, certain parts can be mapped onto other parts by the usual Gestalt laws and are repeated there with high similarity. Moreover, such mappings come in nested hierarchies, e.g. a reflection Gestalt that is made of repetition friezes, whose parts are in turn reflection-symmetric compositions. We aim to develop and test methods that automatically find, parametrize, and assess such nested hierarchies. This can be explicitly modelled by continuous assessment functions. Recognition performance is improved by using additional features such as colors. This paper reports examples from the 2017 data set.
    Keywords: Reflection;Image color analysis;Aggregates;Heating systems;Conferences;Grammar
  • [Conference] Deep Domain Adaptation by Geodesic Distance Minimization
    In this paper, we propose a new approach called Deep LogCORAL for unsupervised visual domain adaptation. Our work builds on the recently proposed Deep CORAL method, which aims to train a convolutional neural network and simultaneously minimize the Euclidean distance between the covariance matrices of the source and target domains. By observing that the second-order statistical information (i.e. the covariance matrix) lies on a Riemannian manifold, we propose to use the Riemannian distance, approximated by the Log-Euclidean distance, to replace the naive Euclidean distance in Deep CORAL. We also consider first-order information, and minimize the distance between the mean vectors of the two domains. We build an end-to-end model, in which we minimize both the classification loss and the domain difference based on the first-order and second-order information between the two domains. Our experiments on the benchmark Office dataset demonstrate the improvements of our newly proposed Deep LogCORAL approach over the Deep CORAL method, as well as the further improvement when optimizing both orders of information.
    Keywords: Covariance matrices;Euclidean distance;Manifolds;Visualization;Feature extraction;Adaptation models;Training data
  • [Conference] Deep Census: AUV-Based Scallop Population Monitoring
    We describe an integrated system for vision-based counting of wild scallops in order to measure population health, particularly pre- and post-dredging in fisheries areas. Sequential images collected by an autonomous underwater vehicle (AUV) are independently analyzed by a convolutional neural network based on the YOLOv2 architecture [18], which offers state-of-the-art object detection accuracy at real-time speeds. To augment the training dataset, a denoising auto-encoder network is used to automatically upgrade manually-annotated approximate object positions to full bounding boxes, increasing the detection network's performance. The system can act as a tool to improve or even replace an existing offline manual annotation workflow, and is fast enough to function "in the loop" for AUV control.
    Keywords: Image color analysis;Manuals;Sociology;Statistics;Object detection;Cameras
  • [Conference] Fast Approximate Karhunen-Loève Transform for Three-Way Array Data
    Organs, cells and microstructures in cells dealt with in biomedical image analysis are volumetric data. From the viewpoint of object-oriented data analysis, these data must be processed and analysed as volumetric data, without embedding them into a higher-dimensional vector space. Sampled values of volumetric data are expressed as three-way array data. Therefore, principal component analysis of multi-way data is an essential technique for subspace-based pattern recognition, data retrieval and data compression of volumetric data. For the one-way array (vector form) problem, the discrete cosine transform matrix is a good relaxed solution of the eigenmatrix for principal component analysis. This algebraic property of principal component analysis yields an approximate fast algorithm for PCA of three-way data arrays.
    Keywords: Tensile stress;Principal component analysis;Arrays;Pattern recognition;Discrete cosine transforms;Optimization
  • [Conference] Mind the Gap: Virtual Shorelines for Blind and Partially Sighted People
    Blind and partially sighted people have encountered numerous devices to improve their mobility and orientation, yet most still rely on traditional techniques, such as the white cane or a guide dog. In this paper, we consider improving the actual orientation process through the creation of routes that are better suited to specific needs. More precisely, this work focuses on routing for blind and partially sighted people at a shoreline-like level of detail, modeled after real-world white cane usage. Our system is able to create such fine-grained routes through the extraction of routing features from openly available geolocation data, e.g., building facades and road crossings. More importantly, the generated routes provide a measurable safety benefit, as they reduce the number of unmarked pedestrian crossings and try to utilize much more accessible alternatives. Our evaluation shows that such fine-grained routing can improve users' safety and their understanding of the environment lying ahead, especially the upcoming route and its impediments.
    Keywords: Routing;Roads;Urban areas;Buildings;Safety;Global navigation satellite system
  • [Conference] The Benefits of Evaluating Tracker Performance Using Pixel-Wise Segmentations
    For years, the ground truth data for evaluating object trackers has consisted of axis-aligned or oriented boxes. This greatly reduces the workload of labeling the datasets in the common benchmarks. Nevertheless, boxes are a very coarse approximation of an object, and the approximation by a box has a large degree of ambiguity. Furthermore, tracking approaches that are not restricted to boxes cannot be evaluated within the benchmarks without adding a penalty to them. We present a simple extension to the VOT evaluation procedure that makes it possible to include these approaches. Furthermore, we present upper bounds for trackers restricted to boxes. Moreover, we present a new measure that captures how well an approach can cope with scale changes without the need for frame-wise labels. We present a learning-based approach which helps to identify frames with heavy occlusion automatically. The framework is tested on the segmentations of the VOT2016 dataset.
    Keywords: Image segmentation;Upper bound;Optimized production technology;Benchmark testing;Reliability;Current measurement;Visualization
  • [Conference] Symmetry-Factored Statistical Modelling of Craniofacial Shape
    We present a new method for symmetry-factored statistical modelling of 3D shape. Our method comprises three novel components. First, a means to symmetrise a 3D mesh, regularised using the Laplace-Beltrami operator. Second, a symmetry-aware variant of Generalized Procrustes Analysis (GPA). Third, a means to compute a linear statistical shape model in which symmetric and asymmetric shape variation are modelled separately. We focus on human head data and build the first 3D morphable model of craniofacial asymmetry. The qualitative and quantitative evaluation demonstrates that the proposed model outperforms a linear model that does not decompose symmetric and asymmetric variation. It also validates that symmetry-aware GPA can improve the data generalisation and reconstruction ability of the standard PCA model. We will make our model and the implementation of our method publicly available.
    Keywords: Shape;Three-dimensional displays;Computational modeling;Solid modeling;Principal component analysis;Face;Biological system modeling
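
The "Deep Domain Adaptation by Geodesic Distance Minimization" entry above describes a domain loss that compares covariance matrices through a Log-Euclidean distance and adds a distance between mean vectors. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the function names, the `eps` ridge term, the `mean_weight` parameter and the absence of any normalisation constant are all assumptions made for illustration, and the classification loss and the network itself are left out.

```python
import numpy as np


def covariance(features):
    """Sample covariance of an (n_samples, n_dims) feature matrix."""
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / (features.shape[0] - 1)


def spd_log(matrix, eps=1e-5):
    """Matrix logarithm of a symmetric positive semi-definite matrix.

    eps is a hypothetical ridge term (an assumption of this sketch) that
    keeps all eigenvalues strictly positive before taking the logarithm.
    """
    dim = matrix.shape[0]
    eigvals, eigvecs = np.linalg.eigh(matrix + eps * np.eye(dim))
    # V diag(log(lambda)) V^T
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def log_euclidean_loss(source, target):
    """Squared Frobenius distance between the logs of the two covariances."""
    diff = spd_log(covariance(source)) - spd_log(covariance(target))
    return float(np.sum(diff ** 2))


def mean_loss(source, target):
    """Squared Euclidean distance between the two mean vectors."""
    diff = source.mean(axis=0) - target.mean(axis=0)
    return float(np.sum(diff ** 2))


def domain_loss(source, target, mean_weight=1.0):
    """Second-order (Log-Euclidean) term plus a weighted first-order term.

    mean_weight is a hypothetical trade-off parameter, not taken from the paper.
    """
    return log_euclidean_loss(source, target) + mean_weight * mean_loss(source, target)
```

In an end-to-end setting, this quantity would be computed on batches of source and target features and added to the classification loss. A differentiable framework would be needed for training; NumPy is used here only to show the arithmetic.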