Fractal dimension (FD) is widely used for image segmentation because it quantifies texture information effectively. In this paper, we present an FD-based multi-focus image fusion method that uses FD to identify focused regions as the primary step of the fusion process. The algorithm extracts the local FD features of each multi-focus pair, estimated using the differential box-counting method. A guided filter is employed to refine the spatial information and increase the robustness of the FD features to noise. The outcome is then analyzed to obtain a focus map that identifies the sharp regions in each partially focused image. Afterwards, the detected regions are combined into a single al
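As a rough illustration of the differential box-counting estimate mentioned above, the following sketch computes an FD value for one square grayscale patch. It is not the authors' implementation: the function name `dbc_fractal_dimension`, the box sizes, the 256 gray levels, and the floor-based box index (one common variant of the method) are all assumptions made for the example.

```python
import numpy as np

def dbc_fractal_dimension(patch, sizes=(2, 4, 8), gray_levels=256):
    """Estimate the fractal dimension of a square patch by differential box counting.

    Hypothetical helper: name, default box sizes and gray-level count are
    illustrative assumptions, not taken from the paper.
    """
    patch = np.asarray(patch, dtype=np.float64)
    m = patch.shape[0]
    if patch.shape != (m, m):
        raise ValueError("patch must be square")
    log_inv_r, log_n = [], []
    for s in sizes:
        if m % s:
            raise ValueError("each box size must divide the patch size")
        h = s * gray_levels / m  # box height in gray levels at this scale
        # Split the patch into (m/s) x (m/s) blocks of s x s pixels.
        blocks = patch.reshape(m // s, s, m // s, s)
        g_max = blocks.max(axis=(1, 3))
        g_min = blocks.min(axis=(1, 3))
        # Number of boxes needed to cover the intensity surface of each block.
        n_boxes = np.floor(g_max / h) - np.floor(g_min / h) + 1
        log_n.append(np.log(n_boxes.sum()))
        log_inv_r.append(np.log(m / s))
    # FD is the slope of log N_r against log (1/r).
    slope, _ = np.polyfit(log_inv_r, log_n, 1)
    return float(slope)
```

A flat patch gives a value of 2 under this estimate, while a textured patch gives a larger value, which is why a local FD can serve as a sharpness cue.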
We have recently proposed approximate median filters (APMF). They are based on a sorting network and achieve acceptable image quality with low-cost hardware. In this brief, we develop a specific comparator to improve the noise-elimination capabilities of those filters. The architecture of our inexact median filters (IMF) is regular and modular. We also introduce the histogram-based error dispersion plot as a new error evaluation method to better assess IMF performance. Simulation results show that the proposed filter is efficient in terms of power, area, and speed. Despite the trade-off between the filtering accuracy and circuit characteristics, the output quality of the filter is largely similar to that of the pr
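For orientation, a sorting-network median can be sketched in software as a fixed sequence of compare-exchange operations. The example below is an exact baseline built from an odd-even transposition network over a 3x3 window; it does not model the inexact comparator described in the abstract, and the names `compare_exchange` and `median9_sorting_network` are hypothetical.

```python
def compare_exchange(values, i, j):
    """Order values[i] and values[j] so the smaller one ends up at index i."""
    if values[i] > values[j]:
        values[i], values[j] = values[j], values[i]

def median9_sorting_network(window):
    """Return the median of 9 samples using an odd-even transposition network.

    Exact (not approximate) baseline; the comparator schedule is fixed and
    data-independent, which is what makes it map to regular hardware.
    """
    v = list(window)
    n = 9
    if len(v) != n:
        raise ValueError("window must contain exactly 9 samples")
    # n rounds of alternating even/odd neighbor comparisons sort n elements.
    for round_index in range(n):
        start = round_index % 2
        for i in range(start, n - 1, 2):
            compare_exchange(v, i, i + 1)
    return v[4]  # middle element of the sorted window
```

An approximate design would replace or prune some of these comparators and accept a small output error in exchange for lower hardware cost.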
In this paper, an off-line method based on hidden Markov models (HMMs) is used for holistic recognition of handwritten words from a limited vocabulary. Three feature sets, based on image gradient, black–white transition, and contour chain code, are used. For each feature set, an HMM is trained for each word. In the recognition step, the outputs of these classifiers are combined through a multilayer perceptron (MLP). The high number of connections in this network makes training computationally expensive. To avoid this problem, a new method is proposed. Results of the proposed method on 16000 images of 200 names of Iranian cities, from the “Iranshahr 3” dataset, are presented and compared with some similar methods. An er
Musical source separation deals with extracting the musical signals from a mixture. One efficient way to attain this goal is to decompose the mixture over a dictionary of basic functions that inherently describe the instruments. Usually, a unique function, called the note-specific atom, is synthesized for each note of each instrument. In this paper, a sine-harmonic model is used to synthesize note-specific atoms, and the note’s fundamental frequency is used as prior information to determine the model parameters. To calculate these parameters, the training signal spectrum is processed only around the main note harmonics. Experimental results demonstrated that the proposed method is much faster
Despite the recent progress in OCR technologies, whole-book recognition is still a challenging task, particularly for old and historical books, where unknown font faces and the low quality of paper and print add to the challenge. Therefore, pre-trained recognizers and generic methods do not usually perform up to the required standards, and their performance usually degrades for larger-scale recognition tasks, such as a whole book. Such reportedly low error-rate methods turn out to require a great deal of manual correction. Generally, such methodologies do not make effective use of concepts such as redundancy in whole-book recognition. In this work, we propose to train Long Short Term Memory (LSTM) networks on a minimal training set obtai
An attractive topic in Music Information Retrieval (MIR) is query-by-example (QBE), which receives a user-provided query and aims to find the target song in an associated music dataset. In this paper, we use feature and decision fusion techniques to develop an accurate and rapid two-stage QBE-based MIR system. For this purpose, a proposed diverse ensemble of recognizers automatically recognizes the genre of the query in the first stage. This diversity is obtained through feature extraction over different frequency bands followed by feature fusion to train the recognizers; a decision fusion technique then fuses the individual results obtained by the members of the ensemble. The second stage measures similarity between the query and o
In this study, a method for holistic recognition of handwritten Farsi words is proposed, which fuses the outputs of right-to-left (RtL) and left-to-right (LtR) hidden Markov models (HMMs). The experimental results on 16,000 images of 200 names of Iranian cities, from the ‘Iranshahr 3’ dataset, are presented and compared with those of methods using only RtL or LtR models. Experimental results show that the main sources of error are similar beginnings or similar endings of the words. Since the RtL and LtR models behave differently when dealing with the words, there is notable error diversity between the two classifiers, so that their combination increases the recognition rate. Compared to the RtL-HMM, the product of output scores of the RtL and
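A product-rule fusion of two classifiers' scores can be sketched as follows. The example assumes each model returns one log-likelihood per vocabulary word, so the product of scores becomes a sum of logs; the function name `fuse_by_product` and the toy scores are hypothetical and are not results from the study.

```python
import numpy as np

def fuse_by_product(log_scores_rtl, log_scores_ltr):
    """Combine per-word log-scores of two models with the product rule.

    Multiplying likelihoods equals adding log-likelihoods, which avoids
    numerical underflow. Returns the winning word index and the fused scores.
    """
    rtl = np.asarray(log_scores_rtl, dtype=float)
    ltr = np.asarray(log_scores_ltr, dtype=float)
    if rtl.shape != ltr.shape:
        raise ValueError("both models must score the same vocabulary")
    fused = rtl + ltr
    return int(np.argmax(fused)), fused

# Toy scores (assumed): each model alone prefers a different wrong word,
# but the word both models rate fairly well wins after fusion.
best, fused = fuse_by_product([-10.0, -11.0, -30.0], [-30.0, -11.0, -10.0])
```

The toy case shows how error diversity helps: a word that neither model ranks first can still win when both models score it consistently.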
This paper develops a new method of onset detection for the Tar, a traditional Iranian musical instrument. The proposed method is based on both pitch and energy features. Therefore, it can be used to detect either soft or hard onsets. Through this combination, we obtained a more precise separation between two adjacent notes. This ability is especially useful for detecting the reaz, repeatedly played notes with the same frequency and short durations. For the evaluation of the method, a data set with predetermined onsets was produced, and the results were compared with those of an energy-based method in terms of F-measure.
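The F-measure used for such an evaluation can be sketched as below. A detected onset counts as correct if it lies within a tolerance of an unmatched reference onset; the 50 ms tolerance, the greedy one-to-one matching, and the function name `onset_f_measure` are assumptions for illustration, since the abstract does not state them.

```python
def onset_f_measure(reference, detected, tolerance=0.05):
    """Return (F-measure, precision, recall) for onset times in seconds.

    Assumed protocol: sorted greedy matching, each reference onset may be
    matched by at most one detection within +/- tolerance.
    """
    ref = sorted(reference)
    det = sorted(detected)
    i = j = hits = 0
    while i < len(ref) and j < len(det):
        diff = det[j] - ref[i]
        if abs(diff) <= tolerance:
            hits += 1          # matched pair, consume both
            i += 1
            j += 1
        elif diff < 0:
            j += 1             # detection too early: false positive
        else:
            i += 1             # reference has no detection: miss
    precision = hits / len(det) if det else 0.0
    recall = hits / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0, precision, recall
    return 2 * precision * recall / (precision + recall), precision, recall
```

With four reference onsets and three detections of which two match, precision is 2/3, recall is 1/2, and the F-measure is 4/7.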
In this paper, we present a method for color reduction of Persian carpet cartoons that increases both speed and accuracy of editing. Carpet cartoons are in two categories: machine-printed and hand-drawn. Hand-drawn cartoons are divided into two groups: before and after discretization. The purpose of this study is color reduction of hand-drawn cartoons before discretization. The proposed algorithm consists of the following steps: image segmentation, finding the color of each region, color reduction around the edges, and final color reduction with C-means. The proposed method requires knowing the desired number of colors in any cartoon. In this method, the number of colors is not reduced to more than about 1.3 times the desired number. Auto
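The final clustering step can be pictured with a small hard C-means (k-means) sketch over RGB values. This assumes the hard variant of C-means and covers only that last step, not the segmentation or edge handling; the function name `reduce_colors`, the random initialization, and the iteration count are hypothetical choices.

```python
import numpy as np

def reduce_colors(pixels, n_colors, n_iter=20, seed=0):
    """Quantize an (N, 3) array of colors to at most n_colors cluster centers.

    Plain hard C-means (k-means) sketch; returns the quantized colors and
    the cluster label of each pixel.
    """
    pixels = np.asarray(pixels, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers from distinct randomly chosen pixels.
    centers = pixels[rng.choice(len(pixels), size=n_colors, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every pixel to every center.
        dist = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(n_colors):
            members = pixels[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return centers[labels], labels
```

Because the desired number of colors must be known in advance, `n_colors` is passed in explicitly rather than estimated.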
In dictionary-based image super-resolution (SR) methods, the resolution of the input image is enhanced using a dictionary of low-resolution (LR) and high-resolution (HR) image patches. Typically, a single dictionary is learned from all the patches in the training set. Then, the input LR patch is super-resolved using its nearest LR patches and their corresponding HR patches in the dictionary. In this paper, we propose a text-image SR method using multiple class-specific dictionaries. Each dictionary is learned from the patches of images of a specific character in the training set. The input LR image is segmented into text lines and characters, and the characters are preliminarily classified. Likewise, overlapping patches a
In color reduction algorithms, the result is usually evaluated by visual or qualitative standards. An evaluation that ignores quantitative standards is neither complete nor accurate, and it is strongly affected by the viewer's preferences. In some articles, the result is evaluated with MSE. This error measure treats any difference between the pixel colors of the final image and those of the original image as a failure, which makes it unsuitable for evaluating color reduction methods. In image color reduction, if a color is completely replaced by a color close to the original, the replacement should not be considered a failure. If these replacements don’t happen for all of those specific color pixels, t
In this paper, a sparse method is proposed to synthesize the note-specific atoms for musical notes of different instruments, and is applied to separate the sounds of two instruments coexisting in a monaural mixture. The main idea is to explore the inherent time structures of the musical notes by a novel adaptive method. These structures are used to synthesize some time-domain functions called note-specific atoms. The note-specific atoms of different instruments are integrated in a global dictionary. In this dictionary, there is only one note-specific atom for each note of any instrument, resulting in a sparse space for each instrument. The signal separation is done by mapping the mixture signal to the global dictionary. The signal related t
OCR of historical printed books is one of the challenging tasks in the area of document image analysis. Low quality of print and paper and unfamiliar font faces are the best-known problems. However, redundancy of word and sub-word occurrences in the document can be used to improve the recognition results. In this paper, we propose a highly accurate recognition system for old printed books. We use the combination of sub-word clustering and an LSTM neural network as a character recognizer to reduce the error rate. Due to the lack of information about the font faces, we manually label some part of the books. We show that the recognition error rate can be reduced noticeably by combining the results of the LSTM recognizer and sub-word clustering and
In text images, some frequently used characters are repeated more than others. Likewise, some characters have common strokes. This characteristic is used in this paper for machine-printed text-image super resolution. After segmenting the input low-resolution image into text lines and characters, 1) the characters are clustered and the clusters with a large number of members, corresponding to the frequent characters, are detected. 2) A text-specific multiple-image super resolution is applied to the members of each large cluster and the result is verified by the recognition confidence of an OCR system. 3) A training example set is then constructed by extracting patches from the low-resolution frequent characters and their
This paper develops a new method of onset detection for the Tar, a traditional Iranian musical instrument. The proposed method is based on both pitch and energy features, and an adaptive peak-picking algorithm is used for primary onset detection. An improved template matching method is used to detect fundamental frequencies, and finally, onsets are tagged based on the primary onsets and fundamental frequencies. This step is especially useful for detecting the reaz, repeatedly played notes with the same frequency and short durations. For the evaluation of the method, a data set with predetermined onsets was produced, and the results were compared with those of an energy-based method in terms of F-measure.
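Adaptive peak picking is commonly implemented by comparing a detection function against a threshold that follows its local level. The sketch below uses a sliding-median threshold, which is a generic choice and not necessarily the algorithm used in the paper; the function name `pick_peaks` and the parameters `half_window`, `scale`, and `offset` are assumptions.

```python
import numpy as np

def pick_peaks(detection, half_window=3, scale=1.0, offset=0.1):
    """Return indices of local maxima that exceed a sliding-median threshold.

    Generic adaptive-threshold sketch: threshold[n] is
    scale * median(detection around n) + offset.
    """
    d = np.asarray(detection, dtype=float)
    peaks = []
    for n in range(1, len(d) - 1):
        lo = max(0, n - half_window)
        hi = min(len(d), n + half_window + 1)
        threshold = scale * np.median(d[lo:hi]) + offset
        is_local_max = d[n] > d[n - 1] and d[n] >= d[n + 1]
        if is_local_max and d[n] > threshold:
            peaks.append(n)
    return peaks

# Toy detection function (assumed): two strong peaks and one weak bump.
curve = [0, 0, 0.1, 1.0, 0.1, 0, 0, 0.05, 0.1, 0.05, 0, 0.9, 0.2, 0]
candidate_onsets = pick_peaks(curve)
```

In this toy curve the weak bump at index 8 stays below the local threshold, so only the two strong peaks are kept as primary onset candidates.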
In this paper, a new method for resolution enhancement of single document images is presented. The proposed method is example-based, using an example set of low-resolution and high-resolution training patches. According to the Bayes rule, one function is considered as the likelihood or data-fidelity term that measures the fidelity of the output high-resolution image to the input low-resolution image. In addition, three other functions are considered as the regularization terms containing the prior knowledge about the desired high-resolution document image. The three priors fulfilled by the regularization terms are bimodality of document images, smoothness of background and text regions, and similarity to the patches in the example set. By minimi
In Bayesian image super resolution (SR), a regularisation term is minimised along with a data-fidelity term to generate a high-resolution (HR) image from an input low-resolution (LR) image. The regularisation term is incorporated into the SR to enforce prior knowledge about the HR image. For instance, smoothness in the background and foreground regions is prior knowledge about document images. The bilateral total variation (BTV), as a known regularisation term, uniformly smooths the image in all directions while preserving the edges. In this study, the authors present a document image SR method by introducing a new regularisation term called the stroke width-based directional total variation (SWDTV). It is a modified version of the BTV,
Methods for learning handwriting categories fail to perform well when trained and tested on data from different databases. In this paper, we propose a novel large margin domain adaptation algorithm that is able to learn a transformation between training and test datasets, in addition to adapting the parameters of the classifier, using few or even no labeled training samples from the target handwriting dataset. Additionally, we developed a framework of ensemble projection feature learning for dataset representation as a front end for our algorithm, to utilize the abundant unlabeled samples in the target domain. Experiments on different handwritten digit dataset adaptations demonstrate that the proposed large margin domain adaptation algorithm achieves superior cl