C. Lee Giles

Orcid: 0000-0002-1931-585X

Affiliations:
  • Penn State, University Park, USA


According to our database, C. Lee Giles authored at least 535 papers between 1985 and 2024.

Awards

ACM Fellow

ACM Fellow 2006, "For contributions to information processing and web analysis."

IEEE Fellow

IEEE Fellow 1997, "For contributions to the theory and practice of neural networks."

Bibliography

2024
A provably stable neural network Turing Machine with finite precision and time.
Inf. Sci., February, 2024

SciCapenter: Supporting Caption Composition for Scientific Figures with Machine-Generated Captions and Ratings.
CoRR, 2024

Automated Detection and Analysis of Data Practices Using A Real-World Corpus.
CoRR, 2024

Stability Analysis of Various Symbolic Rule Extraction Methods from Recurrent Neural Network.
CoRR, 2024

2023
On the Computational Complexity and Formal Hierarchy of Second Order Recurrent Neural Networks.
CoRR, 2023

On the Tensor Representation and Algebraic Homomorphism of the Neural State Turing Machine.
CoRR, 2023

Summaries as Captions: Generating Figure Captions for Scientific Documents with Automated Text Summarization.
Proceedings of the 16th International Natural Language Generation Conference, 2023

A Prototype Hybrid Prediction Market for Estimating Replicability of Published Work.
Proceedings of the HHAI 2023: Augmenting Human Intellect, 2023

GPT-4 as an Effective Zero-Shot Evaluator for Scientific Figure Captions.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Privacy Lost and Found: An Investigation at Scale of Web Privacy Policy Availability.
Proceedings of the ACM Symposium on Document Engineering 2023, 2023

Privacy Now or Never: Large-Scale Extraction and Analysis of Dates in Privacy Policy Text.
Proceedings of the ACM Symposium on Document Engineering 2023, 2023

Artificial Prediction Markets Present a Novel Opportunity for Human-AI Collaboration.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Backpropagation-Free Deep Learning with Recursive Local Representation Alignment.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

ACL-Fig: A Dataset for Scientific Figure Classification.
Proceedings of the Workshop on Scientific Document Understanding co-located with 37th AAAI Conference on Artificial Intelligence (AAAI 2023), 2023

2022
Artificial prediction markets present a novel opportunity for human-AI collaboration.
CoRR, 2022

A Study of Computational Reproducibility using URLs Linking to Open Access Datasets and Software.
Proceedings of the Companion of The Web Conference 2022, Virtual Event / Lyon, France, April 25, 2022

Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Design Considerations for a Sustainable Scholarly Big Data Service.
Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation, 2022

Scholarly big data quality assessment: a case study of document linking and conflation with S2ORC.
Proceedings of the 22nd ACM Symposium on Document Engineering, 2022

Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder.
Proceedings of the Data Compression Conference, 2022

SciBERTSUM: Extractive Summarization for Scientific Documents.
Proceedings of the Document Analysis Systems - 15th IAPR International Workshop, 2022

A Synthetic Prediction Market for Estimating Confidence in Published Work.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Three Benchmark Datasets for Scholarly Article Layout Analysis.
Dataset, May, 2021

Guest Editorial: Scholarly Big Data.
IEEE Trans. Emerg. Top. Comput., 2021

Screenomics: A Framework to Capture and Analyze Personal Life Experiences and the Ways that Technology Shapes Them.
Hum. Comput. Interact., 2021

An Entropy Metric for Regular Grammar Classification and Learning with Recurrent Neural Networks.
Entropy, 2021

Extractive Research Slide Generation Using Windowed Labeling Ranking.
CoRR, 2021

Predicting the Reproducibility of Social and Behavioral Science Papers Using Supervised Learning Models.
CoRR, 2021

Design and Analysis of a Synthetic Prediction Market using Dynamic Convex Sets.
CoRR, 2021

Extraction and Evaluation of Statistical Information from Social and Behavioral Science Papers.
Proceedings of the Companion of The Web Conference 2021, 2021

What Were People Searching For? A Query Log Analysis of An Academic Search Engine.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2021

ChartReader: Automatic Parsing of Bar-Plots.
Proceedings of the 22nd IEEE International Conference on Information Reuse and Integration for Data Science, 2021

PrivaSeer: A Privacy Policy Search Engine.
Proceedings of the Web Engineering - 21st International Conference, 2021

Investigating Backpropagation Alternatives when Learning to Dynamically Count with Recurrent Neural Networks.
Proceedings of the 15th International Conference on Grammatical Inference, 2021

Recognizing Long Grammatical Sequences using Recurrent Networks Augmented with an External Differentiable Stack.
Proceedings of the 15th International Conference on Grammatical Inference, 2021

Document Domain Randomization for Deep Learning Document Layout Extraction.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

SciCap: Generating Captions for Scientific Figures.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Math Question Solving and MCQ Distractor Generation with attentional GRU Networks.
Proceedings of the 14th International Conference on Educational Data Mining, 2021

A large-scale exploration of terms of service documents on the web.
Proceedings of the DocEng '21: ACM Symposium on Document Engineering 2021, 2021

SlideGen: an abstractive section-based slide generator for scholarly documents.
Proceedings of the DocEng '21: ACM Symposium on Document Engineering 2021, 2021

An Empirical Analysis of Recurrent Learning Algorithms in Neural Lossy Image Compression Systems.
Proceedings of the 31st Data Compression Conference, 2021

OmniLayout: Room Layout Reconstruction From Indoor Spherical Panoramas.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Ranked List Fusion and Re-ranking with Pre-trained Transformers for ARQMath Lab.
Proceedings of the Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania, September 21st - to, 2021

Building an Accessible, Usable, Scalable, and Sustainable Service for Scholarly Big Data.
Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), 2021

Privacy at Scale: Introducing the PrivaSeer Corpus of Web Privacy Policies.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Author Homepage Discovery in CiteSeerX.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Understanding and Predicting Retractions of Published Work.
Proceedings of the Workshop on Scientific Document Understanding co-located with 35th AAAI Conference on Artificial Intelligence, 2021

Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations.
IEEE Trans. Neural Networks Learn. Syst., 2020

A Neural State Pushdown Automata.
IEEE Trans. Artif. Intell., 2020

Shapley Homology: Topological Analysis of Sample Influence for Neural Networks.
Neural Comput., 2020

Large Scale Subject Category Classification of Scholarly Papers With Deep Attentive Neural Networks.
Frontiers Res. Metrics Anal., 2020

Automating Document Classification with Distant Supervision to Increase the Efficiency of Systematic Reviews.
CoRR, 2020

Extractive Summarizer for Scholarly Articles.
CoRR, 2020

Provably Stable Interpretable Encodings of Context Free Grammars in RNNs with a Differentiable Stack.
CoRR, 2020

CODA-19: Reliably Annotating Research Aspects on 10,000+ CORD-19 Abstracts Using a Non-Expert Crowd.
CoRR, 2020

Reducing the Computational Burden of Deep Learning with Recursive Local Representation Alignment.
CoRR, 2020

Deep Learning, Grammar Transfer, and Transportation Theory.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2020

Keyphrase Extraction in Scholarly Digital Library Search Engines.
Proceedings of the Web Services - ICWS 2020, 2020

Follow The Curve: Arbitrarily Oriented Scene Text Detection Using Key Points Spotting And Curve Prediction.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

Acknowledgement Entity Recognition in CORD-19 Papers.
Proceedings of the First Workshop on Scholarly Document Processing, 2020

Learning CNF Blocking for Large-scale Author Name Disambiguation.
Proceedings of the First Workshop on Scholarly Document Processing, 2020

Accelerating Substructure Similarity Search for Formula Retrieval.
Proceedings of the Advances in Information Retrieval, 2020

COVIDSeer: Extending the CORD-19 Dataset.
Proceedings of the DocEng '20: ACM Symposium on Document Engineering 2020, Virtual Event, CA, USA, September 29, 2020

The Sibling Neural Estimator: Improving Iterative Image Decoding with Gradient Communication.
Proceedings of the Data Compression Conference, 2020

PSU at CLEF-2020 ARQMath Track: Unsupervised Re-ranking using Pretraining.
Proceedings of the Working Notes of CLEF 2020, 2020

Modeling Updates of Scholarly Webpages Using Archived Data.
Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

Adversarial Models for Deterministic Finite Automata.
Proceedings of the Advances in Artificial Intelligence, 2020

Automatic Generation of Headlines for Online Math Questions.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Query Auto Completion for Math Formula Search.
CoRR, 2019

Sibling Neural Estimators: Improving Iterative Image Decoding with Gradient Communication.
CoRR, 2019

Connecting First and Second Order Recurrent Networks with Deterministic Finite Automata.
CoRR, 2019

Lifelong Neural Predictive Coding: Sparsity Yields Less Forgetting when Learning Cumulatively.
CoRR, 2019

Sec-Lib: Protecting Scholarly Digital Libraries From Infected Papers Using Active Machine Learning Framework.
IEEE Access, 2019

Bi-LSTM-CRF Sequence Labeling for Keyphrase Extraction from Scholarly Documents.
Proceedings of the World Wide Web Conference, 2019

TextContourNet: A Flexible and Effective Framework for Improving Scene Text Detection Architecture With a Multi-Task Cascade.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Automatic Slide Generation for Scientific Papers.
Proceedings of the Third International Workshop on Capturing Scientific Knowledge co-located with the 10th International Conference on Knowledge Capture (K-CAP 2019), 2019

Tangent-CFT: An Embedding Model for Mathematical Formulas.
Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval, 2019

Active Learning of Strict Partial Orders: A Case Study on Concept Prerequisite Relations.
Proceedings of the 12th International Conference on Educational Data Mining, 2019

Learned Neural Iterative Decoding for Lossy Image Compression Systems.
Proceedings of the Data Compression Conference, 2019

A Neural Temporal Model for Human Motion Prediction.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Hybrid Deep Pairwise Classification for Author Name Disambiguation.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

A Learning-based Text Synthesis Engine for Scene Text Detection.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

BBookX: Creating Semi-Automated Textbooks to Support Student Learning and Decrease Student Costs.
Proceedings of the First Workshop on Intelligent Textbooks co-located with 20th International Conference on Artificial Intelligence in Education (AIED 2019), 2019

CiteSeerX: 20 years of service to scholarly big data.
Proceedings of the Conference on Artificial Intelligence for Data Discovery and Reuse, 2019

Adversarial Training for Community Question Answer Selection Based on Multi-Scale Matching.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Cleaning Noisy and Heterogeneous Metadata for Record Linking across Scholarly Big Datasets.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
An Empirical Evaluation of Rule Extraction from Recurrent Neural Networks.
Neural Comput., 2018

Verification of Recurrent Neural Networks Through Rule Extraction.
CoRR, 2018

Online Learning of Recurrent Neural Architectures by Locally Aligning Distributed Representations.
CoRR, 2018

Guided Attention for Large Scale Scene Text Verification.
CoRR, 2018

Learned Iterative Decoding for Lossy Image Compression Systems.
CoRR, 2018

Conducting Credit Assignment by Aligning Local Representations.
CoRR, 2018

A Comparison of Rule Extraction for Different Recurrent Neural Network Models and Grammatical Complexity.
CoRR, 2018

Extracting Semantic Relations for Scholarly Knowledge Base Construction.
Proceedings of the 12th IEEE International Conference on Semantic Computing, 2018

Text extraction and retrieval from smartphone screenshots: building a repository for life in media.
Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 2018

A Web Service for Author Name Disambiguation in Scholarly Databases.
Proceedings of the 2018 IEEE International Conference on Web Services, 2018

Defending Against Adversarial Samples Without Security through Obscurity.
Proceedings of the IEEE International Conference on Data Mining, 2018

CiteSeerX-2018: A Cleansed Multidisciplinary Scholarly Big Dataset.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

Distractor Generation for Multiple Choice Questions Using Learning to Rank.
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications@NAACL-HLT 2018, 2018

Large Scale Scene Text Verification with Guided Attention.
Proceedings of the Computer Vision - ACCV 2018, 2018

Investigating Active Learning for Concept Prerequisite Learning.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Learning a Hierarchical Latent-Variable Model of 3D Shapes.
Proceedings of the 2018 International Conference on 3D Vision, 2018

2017
Unifying Adversarial Training Algorithms with Data Gradient Regularization.
Neural Comput., 2017

How are you feeling?: A personalized methodology for predicting mental states from temporally observable physical and behavioral information.
J. Biomed. Informatics, 2017

Learning to Adapt by Minimizing Discrepancy.
CoRR, 2017

The Neural Network Pushdown Automaton: Model, Stack and Learning Simulations.
CoRR, 2017

An Empirical Evaluation of Recurrent Neural Network Rule Extraction.
CoRR, 2017

Scaling Author Name Disambiguation with CNF Blocking.
CoRR, 2017

Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Network.
CoRR, 2017

Learning a Hierarchical Latent-Variable Model of Voxelized 3D Shapes.
CoRR, 2017

Scholarly Digital Libraries as a Platform for Malware Distribution.
Proceedings of the A Systems Approach to Cyber Security, 2017

Adversary Resistant Deep Neural Networks with an Application to Malware Detection.
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13, 2017

A Supervised Learning Approach To Entity Matching Between Scholarly Big Datasets.
Proceedings of the Knowledge Capture Conference, 2017

Distractor Generation with Generative Adversarial Nets for Automatically Creating Fill-in-the-blank Questions.
Proceedings of the Knowledge Capture Conference, 2017

Text Extraction from Smartphone Screenshots to Archive in situ Media Behavior.
Proceedings of the Knowledge Capture Conference, 2017

Smart Library: Identifying Books on Library Shelves Using Supervised Deep Learning for Scene Text Reading.
Proceedings of the 2017 ACM/IEEE Joint Conference on Digital Libraries, 2017

HESDK: A Hybrid Approach to Extracting Scientific Domain Knowledge Entities.
Proceedings of the 2017 ACM/IEEE Joint Conference on Digital Libraries, 2017

Learning to Read Irregular Text with Attention Mechanisms.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Improving Offline Handwritten Chinese Character Recognition by Iterative Refinement.
Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

Multi-Scale Multi-Task FCN for Semantic Page Segmentation and Table Detection.
Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

Compiling Keyphrase Candidates for Scientific Literature Based on Wikipedia.
Proceedings of the Joint Proceedings of the 1st Workshop on Temporal Dynamics in Digital Libraries (TDDL 2017), 2017

Automatic Knowledge Base Construction from Scholarly Documents.
Proceedings of the 2017 ACM Symposium on Document Engineering, 2017

Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting in the Wild.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Recovering Concept Prerequisite Relations from University Course Dependencies.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

A Machine Learning Approach for Semantic Structuring of Scientific Charts in Scholarly Documents.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
AlgorithmSeer: A System for Extracting and Searching for Algorithms in Scholarly Big Data.
IEEE Trans. Big Data, 2016

Smart Library: Identifying Books in a Library using Richly Supervised Deep Scene Text Reading.
CoRR, 2016

Random Feature Nullification for Adversary Resistant Deep Architecture.
CoRR, 2016

Learning Adversary-Resistant Deep Neural Networks.
CoRR, 2016

Using Non-invertible Data Transformations to Build Adversary-Resistant Deep Neural Networks.
CoRR, 2016

Unifying Adversarial Training Algorithms with Flexible Deep Data Gradient Regularization.
CoRR, 2016

Random Forest DBSCAN for USPTO Inventor Name Disambiguation.
CoRR, 2016

Reports of the 2016 AAAI Workshop Program.
AI Mag., 2016

BBookX: Design of an Automated Web-based Recommender System for the Creation of Open Learning Content.
Proceedings of the 25th International Conference on World Wide Web, 2016

Scholarly Big Data Knowledge and Semantics.
Proceedings of the 25th International Conference on World Wide Web, 2016

CiteSeerX data: semanticizing scholarly papers.
Proceedings of the International Workshop on Semantic Big Data, 2016

Scalable algorithms for scholarly figure mining and semantics.
Proceedings of the International Workshop on Semantic Big Data, 2016

Detecting Arbitrary Oriented Text in the Wild with a Visual Attention Model.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Information Extraction for Scholarly Digital Libraries.
Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, 2016

Improving Similar Document Retrieval Using a Recursive Pseudo Relevance Feedback Strategy.
Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, 2016

Inventor Name Disambiguation for a Patent Database Using a Random Forest and DBSCAN.
Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, 2016

Towards Better Understanding of Academic Search.
Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, 2016

Curve Separation for Line Graphs in Scholarly Documents.
Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, 2016

Using Prerequisites to Extract Concept Maps from Textbooks.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

Financial Entity Record Linkage with Random Forests.
Proceedings of the Second International Workshop on Data Science for Macro-Modeling, 2016

Aggregating Local Context for Accurate Scene Text Detection.
Proceedings of the Computer Vision - ACCV 2016, 2016

Exploring Multiple Feature Spaces for Novel Entity Discovery.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

BBookX: Building Online Open Books for Personalized Learning.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Document Type Classification in Online Digital Libraries.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Automatic Summary Generation for Scientific Data Charts.
Proceedings of the Scholarly Big Data: AI Perspectives, 2016

2015
Improving Researcher Homepage Classification with Unlabeled Data.
ACM Trans. Web, 2015

ASCOS++: An Asymmetric Similarity Measure for Weighted Networks to Address the Problem of SimRank.
ACM Trans. Knowl. Discov. Data, 2015

Digital Library and Archiving for Qatar.
Bull. IEEE Tech. Comm. Digit. Libr., 2015

A generalized topic modeling approach for automatic document annotation.
Int. J. Digit. Libr., 2015

The CHEMDNER corpus of chemicals and drugs and its annotation principles.
J. Cheminformatics, 2015

Chemical entity extraction using CRF and an ensemble of extractors.
J. Cheminformatics, 2015

Online Semi-Supervised Learning with Deep Hybrid Boltzmann Machines and Denoising Autoencoders.
CoRR, 2015

ExpertSeer: a Keyphrase Based Expert Recommender for Digital Libraries.
CoRR, 2015

CiteSeerX: AI in a Digital Library Search Engine.
AI Mag., 2015

Big Scholarly Data in CiteSeerX: Information Extraction from the Web.
Proceedings of the 24th International Conference on World Wide Web Companion, 2015

An Architecture for Information Extraction from Figures in Digital Libraries.
Proceedings of the 24th International Conference on World Wide Web Companion, 2015

On the Use of Similarity Search to Detect Fake Scientific Papers.
Proceedings of the Similarity Search and Applications - 8th International Conference, 2015

Updating Graph Indices with a One-Pass Algorithm.
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31, 2015

Online Learning of Deep Hybrid Architectures for Semi-supervised Categorization.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2015

PDFMEF: A Multi-Entity Knowledge Extraction Framework for Scholarly Documents and Semantic Search.
Proceedings of the 8th International Conference on Knowledge Capture, 2015

Automatic Extraction of Data from Bar Charts.
Proceedings of the 8th International Conference on Knowledge Capture, 2015

Automatically Generating a Concept Hierarchy with Graphs.
Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, 2015

Online Person Name Disambiguation with Constraints.
Proceedings of the 15th ACM/IEEE-CS Joint Conference on Digital Libraries, 2015

A hybrid approach to discover semantic hierarchical sections in scholarly documents.
Proceedings of the 13th International Conference on Document Analysis and Recognition, 2015

Learning a Deep Hybrid Model for Semi-Supervised Text Classification.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Measuring Prerequisite Relations Among Concepts.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Concept Hierarchy Extraction from Textbooks.
Proceedings of the 2015 ACM Symposium on Document Engineering, 2015

BBookX: An Automatic Book Creation Framework.
Proceedings of the 2015 ACM Symposium on Document Engineering, 2015

Automatic Extraction of Figures from Scholarly Documents.
Proceedings of the 2015 ACM Symposium on Document Engineering, 2015

Storybase: Towards Building a Knowledge Base for News Events.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

Sense-Aware Semantic Analysis: A Multi-Prototype Word Representation Model Using Wikipedia.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

A Neural Probabilistic Model for Context Based Citation Recommendation.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
"Building a search engine for algorithms" by Suppawong Tuarob, Prasenjit Mitra, and C. Lee Giles with Martin Vesely as coordinator.
SIGWEB Newsl., 2014

Science and Ethnicity: How Ethnicities Shape the Evolution of Computer Science Research Community.
CoRR, 2014

Extracting Researcher Metadata with Labeled Features.
Proceedings of the 2014 SIAM International Conference on Data Mining, 2014

Document Analysis and Retrieval Tasks in Scientific Digital Libraries.
Proceedings of the Information Retrieval, 2014

Towards building a scholarly big data platform: Challenges, lessons and opportunities.
Proceedings of the IEEE/ACM Joint Conference on Digital Libraries, 2014

Crowd-sourcing Web knowledge for metadata extraction.
Proceedings of the IEEE/ACM Joint Conference on Digital Libraries, 2014

RefSeer: A citation recommendation system.
Proceedings of the IEEE/ACM Joint Conference on Digital Libraries, 2014

The feasibility of investing in manual correction of metadata for a large-scale digital library.
Proceedings of the IEEE/ACM Joint Conference on Digital Libraries, 2014

A Web Service for Scholarly Big Data Information Extraction.
Proceedings of the 2014 IEEE International Conference on Web Services, 2014

Scholarly big data information extraction and integration in the CiteSeerX digital library.
Proceedings of the Workshops Proceedings of the 30th International Conference on Data Engineering Workshops, 2014

Migrating a Digital Library to a Private Cloud.
Proceedings of the 2014 IEEE International Conference on Cloud Engineering, 2014

Utility-Based Control Feedback in a Digital Library Search Engine: Cases in CiteSeerX.
Proceedings of the 9th International Workshop on Feedback Computing, 2014

CiteSeerX: A Scholarly Big Dataset.
Proceedings of the Advances in Information Retrieval, 2014

SimSeerX: a similar document search engine.
Proceedings of the ACM Symposium on Document Engineering 2014, 2014

Classifying and ranking search engine results as potential sources of plagiarism.
Proceedings of the ACM Symposium on Document Engineering 2014, 2014

The impact of user corrections on a crawl-based digital library: A CiteSeerX perspective.
Proceedings of the 10th IEEE International Conference on Collaborative Computing: Networking, 2014

Supervised Ranking for Plagiarism Source Retrieval.
Proceedings of the Working Notes for CLEF 2014 Conference, 2014

Large scale author name disambiguation in digital libraries.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

CiteSeerX: AI in a Digital Library Search Engine.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Mining and Indexing Graphs for Supergraph Search.
Proc. VLDB Endow., 2013

Graph-based Approach to Automatic Taxonomy Generation (GraBTax).
CoRR, 2013

Researcher homepage classification using unlabeled data.
Proceedings of the 22nd International World Wide Web Conference, 2013

Measuring Term Informativeness in Context.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Automatic tag recommendation for metadata annotation using probabilistic topic modeling.
Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, 2013

A classification scheme for algorithm citation function in scholarly works.
Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, 2013

Ranking experts using author-document-topic graphs.
Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, 2013

A figure search engine architecture for a chemistry digital library.
Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, 2013

CSSeer: an expert recommendation system based on CiteseerX.
Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, 2013

Can't see the forest for the trees?: a citation recommendation system.
Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries, 2013

Towards the Discovery of Diseases Related by Genes Using Vertex Similarity Measures.
Proceedings of the IEEE International Conference on Healthcare Informatics, 2013

Table of Contents Recognition and Extraction for Heterogeneous Book Documents.
Proceedings of the 12th International Conference on Document Analysis and Recognition, 2013

Automatic Detection of Pseudocodes in Scholarly Documents Using Machine Learning.
Proceedings of the 12th International Conference on Document Analysis and Recognition, 2013

Figure Metadata Extraction from Digital Documents.
Proceedings of the 12th International Conference on Document Analysis and Recognition, 2013

Scaling SeerSuite in the Cloud.
Proceedings of the 2013 IEEE International Conference on Cloud Engineering, 2013

Searching online book documents and analyzing book citations.
Proceedings of the ACM Symposium on Document Engineering 2013, 2013

Near duplicate detection in an academic digital library.
Proceedings of the ACM Symposium on Document Engineering 2013, 2013

The predictive value of young and old links in a social network.
Proceedings of the 3rd ACM SIGMOD Workshop on Databases and Social Networks, 2013

Unsupervised Ranking for Plagiarism Source Retrieval Notebook for PAN at CLEF 2013.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013

Can back-of-the-book indexes be automatically created?
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

Scholarly big data: information extraction and data mining.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

2013 international workshop on computational scientometrics: theory and applications.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

Channeling the deluge: research challenges for big data and information systems.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

ASCOS: an asymmetric network structure COntext similarity measure.
Proceedings of the Advances in Social Networks Analysis and Mining 2013, 2013

2012
Neural Network Classification and Prior Class Probabilities.
Proceedings of the Neural Networks: Tricks of the Trade - Second Edition, 2012

Specialized Research Datasets in the CiteSeerX Digital Library.
D-Lib Mag., 2012

Web crawler middleware for search engine digital libraries: a case study for citeseerX.
Proceedings of the Twelfth International Workshop on Web Information and Data Management, 2012

The evolution of a crawling strategy for an academic document search engine: whitelists and blacklists.
Proceedings of the Web Science 2012, 2012

A Framework for Bridging the Gap Between Open Source Search Tools.
Proceedings of the SIGIR 2012 Workshop on Open Source Information Retrieval, 2012

Towards Building and Analyzing a Social Network of Acknowledgments in Scientific and Academic Documents.
Proceedings of the Social Computing, Behavioral-Cultural Modeling and Prediction, 2012

Predicting Recent Links in FOAF Networks.
Proceedings of the Social Computing, Behavioral-Cultural Modeling and Prediction, 2012

Discovering missing links in networks using vertex similarity measures.
Proceedings of the ACM Symposium on Applied Computing, 2012

Improving algorithm search using the algorithm co-citation network.
Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries, 2012

A system for indexing tables, algorithms and figures.
Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries, 2012

AckSeer: a repository and search engine for automatically extracted acknowledgments from digital libraries.
Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries, 2012

Similar researcher search in academic environments.
Proceedings of the 12th ACM/IEEE-CS Joint Conference on Digital Libraries, 2012

Iterative Graph Feature Mining for Graph Indexing.
Proceedings of the IEEE 28th International Conference on Data Engineering (ICDE 2012), 2012

Phrase Pair Classification for Identifying Subtopics.
Proceedings of the Advances in Information Retrieval, 2012

Entity resolution using search engine results.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

Recommending citations: translating papers into references.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

Name-Ethnicity Classification and Ethnicity-Sensitive Name Matching.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

Table Header Detection and Classification.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011
Automatic tag recommendation algorithms for social recommender systems.
ACM Trans. Web, 2011

Identifying, Indexing, and Ranking Chemical Formulae and Chemical Names in Digital Documents.
ACM Trans. Inf. Syst., 2011

Nonconvex Online Support Vector Machines.
IEEE Trans. Pattern Anal. Mach. Intell., 2011

ChemXSeer Digital Library Gaussian Search.
CoRR, 2011

Citation recommendation without author supervision.
Proceedings of the Fourth International Conference on Web Search and Web Data Mining, 2011

Capturing missing edges in social networks using vertex similarity.
Proceedings of the 6th International Conference on Knowledge Capture (K-CAP 2011), 2011

Ranking authors in digital libraries.
Proceedings of the 2011 Joint International Conference on Digital Libraries, 2011

On identifying academic homepages for digital libraries.
Proceedings of the 2011 Joint International Conference on Digital Libraries, 2011

CollabSeer: a search engine for collaboration discovery.
Proceedings of the 2011 Joint International Conference on Digital Libraries, 2011

Classifying text messages for the Haiti earthquake.
Proceedings of the 8th International Conference on Information Systems for Crisis Response and Management, 2011

Context Sensitive Topic Models for Author Influence in Document Networks.
Proceedings of the IJCAI 2011, 2011

Watershed Reanalysis: Towards a National Strategy for Model-Data Integration.
Proceedings of the IEEE 7th International Conference on E-Science, 2011

2010
Locality and attachedness-based temporal social network growth dynamics analysis: A case study of evolving nanotechnology scientific collaboration networks.
J. Assoc. Inf. Sci. Technol., 2010

Estimating the web robot population.
Proceedings of the 19th International Conference on World Wide Web, 2010

Exploring web scale language models for search query processing.
Proceedings of the 19th International Conference on World Wide Web, 2010

Context-aware citation recommendation.
Proceedings of the 19th International Conference on World Wide Web, 2010

SNDocRank: document ranking based on social networks.
Proceedings of the 19th International Conference on World Wide Web, 2010

Measuring the web crawler ethics.
Proceedings of the 19th International Conference on World Wide Web, 2010

Finding algorithms in scientific articles.
Proceedings of the 19th International Conference on World Wide Web, 2010

The Ethicality of Web Crawlers.
Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence, 2010

SeerSuite: Developing a Scalable and Reliable Application Framework for Building Digital Libraries by Crawling the Web.
Proceedings of the USENIX Conference on Web Application Development, 2010

Personalized Feed Recommendation Service for Social Networks.
Proceedings of the 2010 IEEE Second International Conference on Social Computing, 2010

SEERLAB: A System for Extracting Keyphrases from Scholarly Documents.
Proceedings of the 5th International Workshop on Semantic Evaluation, 2010

Determining the sexual identities of prehistoric cave artists using digitized handprints: a machine learning approach.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

SNDocRank: a social network-based video search ranking framework.
Proceedings of the 11th ACM SIGMM International Conference on Multimedia Information Retrieval, 2010

oreChem ChemXSeer: a semantic digital library for chemistry.
Proceedings of the 2010 Joint International Conference on Digital Libraries, 2010

Social network document ranking.
Proceedings of the 2010 Joint International Conference on Digital Libraries, 2010

CiteSeerX: A Cloud Perspective.
Proceedings of the 2nd USENIX Workshop on Hot Topics in Cloud Computing, 2010

Enhancing Cross Document Coreference of Web Documents with Context Similarity and Very Large Scale Text Categorization.
Proceedings of the COLING 2010, 2010

Cloud Computing: A Digital Libraries Perspective.
Proceedings of the IEEE International Conference on Cloud Computing, 2010

2009
Better Naive Bayes classification for high-precision spam detection.
Softw. Pract. Exp., 2009

Designing for e-science: Requirements gathering for collaboration in CiteSeer.
Int. J. Hum. Comput. Stud., 2009

Automated analysis of images in documents for intelligent document search.
Int. J. Document Anal. Recognit., 2009

Graph based crawler seed selection.
Proceedings of the 18th International Conference on World Wide Web, 2009

Pairwise Constrained Clustering for Sparse and High Dimensional Feature Spaces.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2009

Solving the "Who's Mark Johnson Puzzle": Information Extraction Based Cross Document Coreference.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

MobiSNA: a mobile video social network application.
Proceedings of the Eighth ACM International Workshop on Data Engineering for Wireless and Mobile Access, 2009

Disambiguating authors in academic publications using random forests.
Proceedings of the 2009 Joint International Conference on Digital Libraries, 2009

Finding topic trends in digital libraries.
Proceedings of the 2009 Joint International Conference on Digital Libraries, 2009

Improving the Table Boundary Detection in PDFs by Fixing the Sequence Error of the Sparse Lines.
Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009

Effectively Searching Maps in Web Documents.
Proceedings of the Advances in Information Retrieval, 2009

Topic and Trend Detection in Text Collections Using Latent Dirichlet Allocation.
Proceedings of the Advances in Information Retrieval, 2009

Efficient record-level wrapper induction.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

Graph-based seed selection for web-scale crawlers.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

Learning to rank graphs for online similar graph search.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

Independent informative subgraph mining for graph information retrieval.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

Detecting topic evolution in scientific literature: how can citations help?
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

Profile Based Cross-Document Coreference Using Kernelized Fuzzy Relational Clustering.
Proceedings of the ACL 2009, 2009

2008
SNAKDD 2008 social network mining and analysis postworkshop report.
SIGKDD Explor., 2008

Workload analysis for scientific literature digital libraries.
Int. J. Digit. Libr., 2008

Design and evaluation of awareness mechanisms in CiteSeer.
Inf. Process. Manag., 2008

Automatic Identification and Data Extraction from 2-Dimensional Plots in Digital Documents.
CoRR, 2008

Modeling and visualizing geo-sensitive queries based on user clicks.
Proceedings of the First International Workshop on Location and the Web, 2008

Learning multiple graphs for document recommendations.
Proceedings of the 17th International Conference on World Wide Web, 2008

Exploring social annotations for information retrieval.
Proceedings of the 17th International Conference on World Wide Web, 2008

Mining, indexing, and searching for textual chemical molecule information on the web.
Proceedings of the 17th International Conference on World Wide Web, 2008

Collaboration over time: characterizing and modeling network evolution.
Proceedings of the International Conference on Web Search and Web Data Mining, 2008

Personalized ranking for digital libraries based on log analysis.
Proceedings of the 10th ACM International Workshop on Web Information and Data Management (WIDM 2008), 2008

Towards Click-Based Models of Geographic Interests in Web Search.
Proceedings of the 2008 IEEE / WIC / ACM International Conference on Web Intelligence, 2008

Real-time automatic tag recommendation.
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2008

Finding a Haystack in Haystacks - Simultaneous Identification of Concepts in Large Bio-Medical Corpora.
Proceedings of the SIAM International Conference on Data Mining, 2008

ParsCit: an Open-source CRF Reference String Parsing Package.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

A metadata generation system for scanned scientific volumes.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2008

Segregating and extracting overlapping data points in two-dimensional plots.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2008

BotSeer: An Automated Information System for Analyzing Web Robots.
Proceedings of the Eighth International Conference on Web Engineering, 2008

Efficient user preference predictions using collaborative filtering.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

A Non-parametric Approach to Pair-Wise Dynamic Topic Correlation Detection.
Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), 2008

A Fast Preprocessing Method for Table Boundary Detection: Narrowing Down the Sparse Lines Using Solely Coordinate Information.
Proceedings of the Eighth IAPR International Workshop on Document Analysis Systems, 2008

Metadata extraction and indexing for map search in web documents.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

Measuring user preference changes in digital libraries.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

A sparse gaussian processes classification framework for fast tag suggestions.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

Identifying table boundaries in digital documents via sparse line detection.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

Real-time data pre-processing technique for efficient feature extraction in large scale datasets.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

Scalable community discovery on textual data with relations.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

Error-driven generalist+experts (edge): a multi-stage ensemble framework for text categorization.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

CiteSense: supporting sensemaking of research literature.
Proceedings of the 2008 Conference on Human Factors in Computing Systems, 2008

Block-Suffix Shifting: Fast, Simultaneous Medical Concept Set Identification in Large Medical Record Corpora.
Proceedings of the AMIA 2008, 2008

Scientific Data and Document Processing in ChemXSeer.
Proceedings of the Semantic Scientific Knowledge Integration, 2008

Automatic Extraction of Data Points and Text Blocks from 2-Dimensional Plots in Digital Documents.
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

Hierarchical Location and Topic Based Query Expansion.
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007
Automatic Extraction of Table Metadata from Digital Documents.
Bull. IEEE Tech. Comm. Digit. Libr., 2007

WebKDD/SNAKDD 2007: web mining and social network analysis post-workshop report.
SIGKDD Explor., 2007

Social Bookmarking for Scholarly Digital Libraries.
IEEE Internet Comput., 2007

Group-Linking Method: A Unified Benchmark for Machine Learning with Recurrent Neural Network.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2007

Are your citations clean?
Commun. ACM, 2007

Image annotation by hierarchical mapping of features.
Proceedings of the 16th International Conference on World Wide Web, 2007

Designing efficient sampling techniques to detect webpage updates.
Proceedings of the 16th International Conference on World Wide Web, 2007

A large-scale study of robots.txt.
Proceedings of the 16th International Conference on World Wide Web, 2007

Extraction and search of chemical formulae in text documents on the web.
Proceedings of the 16th International Conference on World Wide Web, 2007

Generative models for name disambiguation.
Proceedings of the 16th International Conference on World Wide Web, 2007

Deriving knowledge from figures for digital libraries.
Proceedings of the 16th International Conference on World Wide Web, 2007

Automatic searching of tables in digital libraries.
Proceedings of the 16th International Conference on World Wide Web, 2007

A clustering method for web data with multi-type interrelated components.
Proceedings of the 16th International Conference on World Wide Web, 2007

Determining Bias to Search Engines from Robots.txt.
Proceedings of the 2007 IEEE / WIC / ACM International Conference on Web Intelligence, 2007

K-SVMeans: A Hybrid Clustering Algorithm for Multi-Type Interrelated Datasets.
Proceedings of the 2007 IEEE / WIC / ACM International Conference on Web Intelligence, 2007

A clustering-based sampling approach for refreshing search engine's database.
Proceedings of the Tenth International Workshop on the Web and Databases, 2007

Topic segmentation with shared topic detection and alignment of multiple documents.
Proceedings of the SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2007

Active learning for class imbalance problem.
Proceedings of the SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2007

Intelligent Parsing of Scanned Volumes for Web Based Archives.
Proceedings of the First IEEE International Conference on Semantic Computing (ICSC 2007), 2007

Efficient Multiclass Boosting Classification with Active Learning.
Proceedings of the Seventh SIAM International Conference on Data Mining, 2007

IKNN: Informative K-Nearest Neighbor Pattern Classification.
Proceedings of the Knowledge Discovery in Databases: PKDD 2007, 2007

Detecting research topics via the correlation between graphs and texts.
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007

Measuring conference quality by mining program committee characteristics.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2007

Adaptive sorted neighborhood methods for efficient record linkage.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2007

Efficient topic-based unsupervised name disambiguation.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2007

TableSeer: automatic table metadata extraction and searching in digital libraries.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2007

SearchGen: a synthetic workload generator for scientific literature digital libraries and search engines.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2007

An LDA-based Community Structure Discovery Approach for Large-Scale Social Networks.
Proceedings of the IEEE International Conference on Intelligence and Security Informatics, 2007

Learning User Clicks in Web Search.
Proceedings of the IJCAI 2007, 2007

Efficiently Detecting Webpage Updates Using Samples.
Proceedings of the Web Engineering, 7th International Conference, 2007

A Hybrid Cache and Prefetch Mechanism for Scientific Literature Search Engines.
Proceedings of the Web Engineering, 7th International Conference, 2007

Co-ranking Authors and Documents in a Heterogeneous Network.
Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007), 2007

Discovering Temporal Communities from Social Network Documents.
Proceedings of the 7th IEEE International Conference on Data Mining (ICDM 2007), 2007

Extracting Author Meta-Data from Web Using Visual Features.
Proceedings of the Workshops of the 7th IEEE International Conference on Data Mining (ICDM 2007), 2007

HSN-PAM: Finding Hierarchical Probabilistic Groups from Large-Scale Networks.
Proceedings of the Workshops of the 7th IEEE International Conference on Data Mining (ICDM 2007), 2007

Query Expansion Using Topic and Location.
Proceedings of the Workshops of the 7th IEEE International Conference on Data Mining (ICDM 2007), 2007

Automatic Extraction of Data from 2-D Plots in Documents.
Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007

Searching for Tables in Digital Documents.
Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007

Supporting distributed scientific collaboration: Implications for designing the CiteSeer collaboratory.
Proceedings of the 40th Hawaii International Conference on Systems Science (HICSS-40 2007), 2007

Evaluating tagging behavior in social bookmarking systems: metrics and design heuristics.
Proceedings of the 2007 International ACM SIGGROUP Conference on Supporting Group Work, 2007

Popularity Weighted Ranking for Academic Digital Libraries.
Proceedings of the Advances in Information Retrieval, 2007

Designing clustering-based web crawling policies for search engine crawlers.
Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007

ChemXSeer: a digital library and data repository for chemical kinetics.
Proceedings of the First Workshop on CyberInfrastructure: Information Management in eScience, 2007

Learning on the border: active learning in imbalanced data classification.
Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007

Probabilistic Community Discovery Using Hierarchical Latent Gaussian Mixture Model.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

TableRank: A Ranking Algorithm for Table Search and Retrieval.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006
Probabilistic models for discovering e-communities.
Proceedings of the 15th international conference on World Wide Web, 2006

CiteSeerX: an architecture and web service design for an academic document search engine.
Proceedings of the 15th international conference on World Wide Web, 2006

An architecture for creating collaborative semantically capable scientific data sharing infrastructures.
Proceedings of the Eighth ACM International Workshop on Web Information and Data Management (WIDM 2006), 2006

Network Flow for Collaborative Ranking.
Proceedings of the Knowledge Discovery in Databases: PKDD 2006, 2006

Efficient Name Disambiguation for Large-Scale Databases.
Proceedings of the Knowledge Discovery in Databases: PKDD 2006, 2006

The Future of CiteSeer: CiteSeerX.
Proceedings of the Knowledge Discovery in Databases: PKDD 2006, 2006

Clustering Scientific Literature Using Sparse Citation Graph Analysis.
Proceedings of the Knowledge Discovery in Databases: PKDD 2006, 2006

Automatic categorization of figures in scientific documents.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2006

Learning metadata from the evidence in an on-line citation matching scheme.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2006

CiteSeerχ: a scalable autonomous scientific digital library.
Proceedings of the 1st International Conference on Scalable Information Systems, 2006

Boosting the Feature Space: Text Classification for Unstructured Data on the Web.
Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), 2006

Towards Next Generation CiteSeer: A Flexible Architecture for Digital Library Deployment.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2006

Topic evolution and social interactions: how authors effect research.
Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management, 2006

Formal definitions of web information search.
Proceedings of the Information Realities: Shaping the Digital Future for All, 2006

2005
Automatic Identification of Informative Sections of Web Pages.
IEEE Trans. Knowl. Data Eng., 2005

Modeling the author bias between two on-line computer science citation databases.
Proceedings of the 14th international conference on World Wide Web, 2005

A hierarchical naive Bayes mixture model for name disambiguation in author citations.
Proceedings of the 2005 ACM Symposium on Applied Computing (SAC), 2005

Rule-based word clustering for document metadata extraction.
Proceedings of the 2005 ACM Symposium on Applied Computing (SAC), 2005

Automatic extraction of informative blocks from webpages.
Proceedings of the 2005 ACM Symposium on Applied Computing (SAC), 2005

Learning term-relationships for ontology construction: creating business ontologies for event explanation.
Proceedings of the 3rd International Conference on Knowledge Capture (K-CAP 2005), 2005

A learning based model for headline extraction of news articles to find explanatory sentences for events.
Proceedings of the 3rd International Conference on Knowledge Capture (K-CAP 2005), 2005

Automatic acknowledgement indexing: expanding the semantics of contribution in the CiteSeer digital library.
Proceedings of the 3rd International Conference on Knowledge Capture (K-CAP 2005), 2005

What's there and what's not?: focused crawling for missing documents in digital libraries.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2005

Name disambiguation in author citations using a K-way spectral clustering method.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2005

Identifying Content Blocks from Web Documents.
Proceedings of the Foundations of Intelligent Systems, 15th International Symposium, 2005

Finding Base Time-Line of a News Article.
Proceedings of the Eighteenth International Florida Artificial Intelligence Research Society Conference, 2005

A Comparison of On-Line Computer Science Citation Databases.
Proceedings of the Research and Advanced Technology for Digital Libraries, 2005

Knowledge Discovery in Web-Directories: Finding Term-Relations to Build a Business Ontology.
Proceedings of the E-Commerce and Web Technologies: 6th International Conference, 2005

Recommender Systems for Intelligence Analysts.
Proceedings of the AI Technologies for Homeland Security, 2005

2004
Guest editorial: Machine learning for the Internet.
ACM Trans. Internet Techn., 2004

Analysis of lexical signatures for improving information persistence on the World Wide Web.
ACM Trans. Inf. Syst., 2004

Who gets acknowledged: Measuring scientific contributions through automatic acknowledgment indexing.
Proc. Natl. Acad. Sci. USA, 2004

Collaborative Filtering with Maximum Entropy.
IEEE Intell. Syst., 2004

Next generation CiteSeer.
Proceedings of the Sixth ACM CIKM International Workshop on Web Information and Data Management (WIDM 2004), 2004

Enabling interoperability for autonomous digital libraries: an API to citeseer services.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

Panorama: extending digital libraries with topical crawlers.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

Two supervised learning approaches for name disambiguation in author citations.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2004

Using CITIDEL to develop and share class plans.
Proceedings of the 9th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education, 2004

Comparing static and dynamic measurements and models of the Internet's AS topology.
Proceedings of IEEE INFOCOM 2004, 2004

A service-oriented architecture for digital libraries.
Proceedings of the Service-Oriented Computing, 2004

CiteSeer-API: towards seamless resource location and interlinking for digital libraries.
Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management, 2004

CiteSeer: Past, Present, and Future.
Proceedings of the Advances in Web Intelligence, 2004

Offering Collaborative-Like Recommendations When Data Is Sparse: The Case of Attraction-Weighted Information Filtering.
Proceedings of the Adaptive Hypermedia and Adaptive Web-Based Systems, 2004

2003
Search engine personalization: An exploratory study.
First Monday, 2003

A Caching Mechanism for Improving Internet based Mobile Ad Hoc Networks Performance.
Proceedings of the Twelfth International World Wide Web Conference - Posters, 2003

Classification of source code archives.
Proceedings of the SIGIR 2003: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, July 28, 2003

Rule-based word clustering for text classification.
Proceedings of the SIGIR 2003: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, July 28, 2003

eBizSearch: a niche search engine for e-business.
Proceedings of the SIGIR 2003: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, July 28, 2003

Information incorporation in online in-Game sports betting markets.
Proceedings of the 4th ACM Conference on Electronic Commerce (EC-2003), 2003

eBizSearch: An OAI-Compliant Digital Library for eBusiness.
Proceedings of the ACM/IEEE 2003 Joint Conference on Digital Libraries (JCDL 2003), 2003

Automatic Document Metadata Extraction Using Support Vector Machines.
Proceedings of the ACM/IEEE 2003 Joint Conference on Digital Libraries (JCDL 2003), 2003

Using an education oriented digital library to organize and present classes in computing and information.
Proceedings of the 8th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education, 2003

Static and Dynamic Analysis of the Internet's Susceptibility to Faults and Attacks.
Proceedings of IEEE INFOCOM 2003, The 22nd Annual Joint Conference of the IEEE Computer and Communications Societies, San Francisco, CA, USA, March 30, 2003

Enhancing distance learning using quality digital libraries and CITIDEL.
Proceedings of the Quality Education @ a Distance, 2003

Evolving Strategies for Focused Web Crawling.
Proceedings of the Machine Learning, 2003

Probabilistic User Behavior Models.
Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM 2003), 2003

Don't move: the T-Rex effect in the predator-prey world.
Proceedings of the Second International Joint Conference on Autonomous Agents & Multiagent Systems, 2003

2002
Winners don't take all: Characterizing the competition for links on the web.
Proc. Natl. Acad. Sci. USA, 2002

Self-Organization and Identification of Web Communities.
Computer, 2002

Extracting query modifications from nonlinear SVMs.
Proceedings of the Eleventh International World Wide Web Conference, 2002

Learning Communication for Multi-agent Systems.
Proceedings of the Innovative Concepts for Agent-Based Systems, 2002

Modelling Information Incorporation in Markets, with Application to Detecting and Explaining Events.
Proceedings of the UAI '02, 2002

Analysis of lexical signatures for finding lost or related documents.
Proceedings of the SIGIR 2002: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2002

What's the code?: automatic classification of source code archives.
Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002

Personalization of Search Engine Web Sites: All Bark, Little Bite.
Proceedings of the International Conference on Internet Computing, 2002

2001
Attractive Periodic Sets in Discrete-Time Recurrent Networks (with Emphasis on Fixed-Point Stability and Bifurcations in Two-Neuron Networks).
Neural Comput., 2001

Noisy Time Series Prediction using Recurrent Neural Networks and Grammatical Inference.
Mach. Learn., 2001

Scholarly publishing in the Internet age: a citation analysis of computer science literature.
Inf. Process. Manag., 2001

Sequence Learning: From Recognition and Prediction to Sequential Decision Making.
IEEE Intell. Syst., 2001

Persistence of Web References in Scientific Research.
Computer, 2001

Web Search - Your Way.
Commun. ACM, 2001

Improving Category Specific Web Search by Learning Query Modifications.
Proceedings of the 2001 Symposium on Applications and the Internet (SAINT 2001), 2001

Feature Selection in Web Applications by ROC Inflections and Powerset Pruning.
Proceedings of the 2001 Symposium on Applications and the Internet (SAINT 2001), 2001

Extracting collective probabilistic forecasts from web games.
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001

How communication can improve the performance of multi-agent systems.
Proceedings of the Fifth International Conference on Autonomous Agents, 2001

2000
Natural Language Grammatical Inference with Recurrent Neural Networks.
IEEE Trans. Knowl. Data Eng., 2000

Accessibility of information on the Web.
Intell., 2000

Learning Chaotic Attractors by Neural Networks.
Neural Comput., 2000

Discovering Relevant Scientific Literature on the Web.
IEEE Intell. Syst., 2000

Talking Helps: Evolving Communicating Agents for the Predator-Prey Pursuit Problem.
Artif. Life, 2000

Inquirus Web Meta-Search Tool: A User Evaluation Study.
Proceedings of WebNet 2000 - World Conference on the WWW and Internet, San Antonio, Texas, USA, October 30, 2000

Focused Crawling Using Context Graphs.
Proceedings of the VLDB 2000, 2000

Collaborative Filtering by Personality Diagnosis: A Hybrid Memory and Model-Based Approach.
Proceedings of the UAI '00: Proceedings of the 16th Conference in Uncertainty in Artificial Intelligence, Stanford University, Stanford, California, USA, June 30, 2000

Bayesian Classification and Feature Selection from Finite Data Sets.
Proceedings of the UAI '00: Proceedings of the 16th Conference in Uncertainty in Artificial Intelligence, Stanford University, Stanford, California, USA, June 30, 2000

Overfitting in Neural Nets: Backpropagation, Conjugate Gradient, and Early Stopping.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

Efficient identification of Web communities.
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, 2000

Overfitting and Neural Networks: Conjugate Gradient and Backpropagation.
Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, 2000

A Normative Examination of Ensemble Learning Algorithms.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Self-Adaptive User Profiles for Large-Scale Data Delivery.
Proceedings of the 16th International Conference on Data Engineering, San Diego, California, USA, February 28, 2000

Persistence of information on the web: Analyzing citations contained in research articles.
Proceedings of the 2000 ACM CIKM International Conference on Information and Knowledge Management, 2000

DEADLINER: Building a New Niche Search Engine.
Proceedings of the 2000 ACM CIKM International Conference on Information and Knowledge Management, 2000

Clustering and Identifying Temporal Trends in Document Databases.
Proceedings of IEEE Advances in Digital Libraries 2000 (ADL 2000), 2000

Social Choice Theory and Recommender Systems: Analysis of the Axiomatic Foundations of Collaborative Filtering.
Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, July 30, 2000

1999
Alternative discrete-time operators: an algorithm for optimal selection of parameters.
IEEE Trans. Signal Process., 1999

Equivalence in knowledge representation: automata, recurrent neural networks, and dynamical fuzzy systems.
Proc. IEEE, 1999

Digital Libraries and Autonomous Citation Indexing.
Computer, 1999

Searching the Web: general and scientific information access.
IEEE Commun. Mag., 1999

Text and Image Metasearch on the Web.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1999

Robust Learning of Chaotic Attractors.
Proceedings of the Advances in Neural Information Processing Systems 12, [NIPS Conference, Denver, Colorado, USA, November 29, 1999]

Distributed Error Correction.
Proceedings of the Fourth ACM conference on Digital Libraries, 1999

A System for Automatic Personalized Tracking of Scientific Literature on the Web.
Proceedings of the Fourth ACM conference on Digital Libraries, 1999

Indexing and Retrieval of Scientific Literature.
Proceedings of the 1999 ACM CIKM International Conference on Information and Knowledge Management, 1999

Architecture of a Metasearch Engine That Supports User Information Needs.
Proceedings of the 1999 ACM CIKM International Conference on Information and Knowledge Management, 1999

Searching the Web: Can You Find What You Want?
Proceedings of the 1999 ACM CIKM International Conference on Information and Knowledge Management, 1999

Autonomous Citation Matching.
Proceedings of the Third Annual Conference on Autonomous Agents, 1999

1998
Neural Networks And Hybrid Intelligent Models: Foundations, Theory, And Applications.
IEEE Trans. Neural Networks, 1998

Fuzzy finite-state automata can be deterministically encoded into recurrent neural networks.
IEEE Trans. Fuzzy Syst., 1998

How embedded memory in recurrent neural network architectures helps learning long-term temporal dependencies.
Neural Networks, 1998

Context and Page Analysis for Improved Web Search.
IEEE Internet Comput., 1998

Inquirus, the NECI Meta Search Engine.
Comput. Networks, 1998

Evaluating Answer Quality/Efficiency Tradeoffs.
Proceedings of the 5th International Workshop on Knowledge Representation Meets Databases (KRDB '98): Innovative Application Programming and Query Interfaces, 1998

Fuzzy Knowledge and Recurrent Neural Networks: A Dynamical Systems Perspective.
Proceedings of the Hybrid Neural Systems, 1998

Reconfigurable Processor Architectures Exploiting High Bandwidth Optical Channels.
Proceedings of the 6th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '98), 1998

CiteSeer: An Automatic Citation Indexing System.
Proceedings of the 3rd ACM International Conference on Digital Libraries, 1998

CiteSeer: An Autonomous Web Agent for Automatic Retrieval and Identification of Interesting Publications.
Proceedings of the Second International Conference on Autonomous Agents, 1998

1997
A delay damage model selection algorithm for NARX neural networks.
IEEE Trans. Signal Process., 1997

Computational capabilities of recurrent NARX neural networks.
IEEE Trans. Syst. Man Cybern. Part B, 1997

Face recognition: a convolutional neural-network approach.
IEEE Trans. Neural Networks, 1997

On the distribution of performance from multiple neural-network trials.
IEEE Trans. Neural Networks, 1997

Time-delay neural networks: representation and induction of finite-state machines.
IEEE Trans. Neural Networks, 1997

The complexity of language recognition by neural networks.
Neurocomputing, 1997

The Neural Network Pushdown Automaton: Architecture, Dynamics and Training.
Proceedings of the Adaptive Processing of Sequences and Data Structures, 1997

Predicting Multiprocessor Memory Access Patterns with Learning Models.
Proceedings of the Fourteenth International Conference on Machine Learning (ICML 1997), 1997

Rule inference for financial prediction using recurrent neural networks.
Proceedings of the IEEE/IAFE 1997 Computational Intelligence for Financial Engineering, 1997

Lessons in Neural Network Training: Overfitting May be Harder than Expected.
Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Innovative Applications of Artificial Intelligence Conference, 1997

Intelligent Methods for File System Optimization.
Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Innovative Applications of Artificial Intelligence Conference, 1997

Presenting and Analyzing the Results of AI Experiments: Data Averaging and Data Snooping.
Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Innovative Applications of Artificial Intelligence Conference, 1997

1996
Learning long-term dependencies in NARX recurrent neural networks.
IEEE Trans. Neural Networks, 1996

An analysis of noise in recurrent neural networks: convergence and generalization.
IEEE Trans. Neural Networks, 1996

Rule Revision With Recurrent Neural Networks.
IEEE Trans. Knowl. Data Eng., 1996

Extraction of rules from discrete-time recurrent neural networks.
Neural Networks, 1996

Stable encoding of large finite-state automata in recurrent neural networks with sigmoid discriminants.
Neural Comput., 1996

Constructing Deterministic Finite-State Automata in Recurrent Neural Networks.
J. ACM, 1996

Neural Network Classification and Prior Class Probabilities.
Proceedings of the Neural Networks: Tricks of the Trade, 1996

Representation and Induction of Finite State Machines using Time-Delay Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 9, 1996

Online prediction of multiprocessor memory access patterns.
Proceedings of International Conference on Neural Networks (ICNN'96), 1996

Representation of fuzzy finite state automata in continuous recurrent neural networks.
Proceedings of International Conference on Neural Networks (ICNN'96), 1996

Correctness, efficiency, extendability and maintainability in neural network simulation.
Proceedings of International Conference on Neural Networks (ICNN'96), 1996

Local minima and generalization.
Proceedings of International Conference on Neural Networks (ICNN'96), 1996

Can recurrent neural networks learn natural language grammars?
Proceedings of International Conference on Neural Networks (ICNN'96), 1996

Convolutional Neural Networks for Face Recognition.
Proceedings of the 1996 Conference on Computer Vision and Pattern Recognition (CVPR '96), 1996

1995
Constructive learning of recurrent neural networks: limitations of recurrent cascade correlation and a simple solution.
IEEE Trans. Neural Networks, 1995

Using recurrent neural networks to learn the structure of interconnection networks.
Neural Networks, 1995

Learning a class of large finite state machines with a recurrent neural network.
Neural Networks, 1995

What NARX Networks Can Compute.
Proceedings of the SOFSEM '95, 22nd Seminar on Current Trends in Theory and Practice of Informatics, Milovy, Czech Republic, November 23, 1995

Learning long-term dependencies is not as difficult with NARX networks.
Proceedings of the Advances in Neural Information Processing Systems 8, 1995

Natural language grammatical inference: a comparison of recurrent neural networks and machine learning methods.
Proceedings of the Connectionist, 1995

1994
First-order versus second-order single-layer recurrent neural networks.
IEEE Trans. Neural Networks, 1994

Pruning recurrent neural networks for improved generalization performance.
IEEE Trans. Neural Networks, 1994

Dynamic recurrent neural networks: Theory and applications.
IEEE Trans. Neural Networks, 1994

Learning with Product Units.
Proceedings of the Advances in Neural Information Processing Systems 7, 1994

Effects of Noise on Convergence and Generalization in Recurrent Networks.
Proceedings of the Advances in Neural Information Processing Systems 7, 1994

An experimental comparison of recurrent neural networks.
Proceedings of the Advances in Neural Information Processing Systems 7, 1994

1993
Experimental Comparison of the Effect of Order in Recurrent Neural Networks.
Int. J. Pattern Recognit. Artif. Intell., 1993

Rule refinement with recurrent neural networks.
Proceedings of International Conference on Neural Networks (ICNN'93), San Francisco, CA, USA, March 28, 1993

Constructive learning of recurrent neural networks.
Proceedings of International Conference on Neural Networks (ICNN'93), San Francisco, CA, USA, March 28, 1993

1992
Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks.
Neural Comput., 1992

Routing in Random Multistage Interconnections Networks: Comparing Exhaustive Search, Greedy and Neural Network Approaches.
Int. J. Neural Syst., 1992

Using Prior Knowledge in a NNPDA to Learn Context-Free Languages.
Proceedings of the Advances in Neural Information Processing Systems 5, [NIPS Conference, Denver, Colorado, USA, November 30, 1992]

The Complexity of Language Recognition by Neural Networks.
Proceedings of the Algorithms, Software, Architecture, 1992

Training Second-Order Recurrent Neural Networks using Hints.
Proceedings of the Ninth International Workshop on Machine Learning (ML 1992), 1992

1991
Neural Network Routing for Random Multistage Interconnection Networks.
Proceedings of the Advances in Neural Information Processing Systems 4, 1991

Extracting and Learning an Unknown Grammar with Recurrent Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 4, 1991

1990
Recurrent neural networks, hidden Markov models and stochastic grammars.
Proceedings of the IJCNN 1990, 1990

1989
Higher Order Recurrent Networks and Grammatical Inference.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

1988
Computational advantages of higher order neural networks.
Neural Networks, 1988

1987
Encoding Geometric Invariances in Higher-Order Neural Networks.
Proceedings of the Neural Information Processing Systems, Denver, Colorado, USA, 1987

1986
Future directions in optical computing (panel session).
Proceedings of the 14th ACM Annual Conference on Computer Science, 1986

1985
Optical Adaptive Associative Computer Architectures.
Proceedings of the Spring COMPCON'85, 1985

