Arnab Bhattacharya

Orcid: 0000-0001-7331-0788

  • Indian Institute of Technology (IIT), Kanpur
  • University of California, Santa Barbara

According to our database, Arnab Bhattacharya authored at least 107 papers between 2004 and 2024.

VAIYAKARANA: A Benchmark for Automatic Grammar Correction in Bangla.
CoRR, 2024

PARAMANU-GANITA: Language Model with Mathematical Capabilities.
CoRR, 2024

PARAMANU-AYN: An Efficient Novel Generative and Instruction-tuned Language Model for Indian Legal Case Documents.
CoRR, 2024

Paramanu: A Family of Novel Efficient Indic Generative Foundation Language Models.
CoRR, 2024

A Likelihood Ratio Test of Genetic Relationship among Languages.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Automated Cognate Detection as a Supervised Link Prediction Task with Cognate Transformer.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Short-Term Fog Forecasting using Meteorological Observations at Airports in North India.
Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD), 2024

VeNoM: Approximate Subgraph Matching with Enhanced Neighbourhood Structural Information.
Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD), 2024

Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Framework for Question-Answering in Sanskrit through Automated Construction of Knowledge Graphs.
CoRR, 2023

Antarlekhaka: A Comprehensive Tool for Multi-task Natural Language Annotation.
CoRR, 2023

Comparative Analysis of Artificial Intelligence for Indian Legal Question Answering (AILQA) Using Different Retrieval and QA Models.
CoRR, 2023

Nonet at SemEval-2023 Task 6: Methodologies for Legal Evaluation.
Proceedings of the 17th International Workshop on Semantic Evaluation, 2023

VACASPATI: A Diverse Corpus of Bangla Literature.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

Cognate Transformer for Automated Phonological Reconstruction and Cognate Reflex Prediction.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Task and Model Agnostic Adversarial Attack on Graph Neural Networks.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Report on the 2nd Symposium on Artificial Intelligence and Law (SAIL) 2022.
SIGIR Forum, June, 2022

SpotSpam: Intention Analysis-driven SMS Spam Detection Using BERT Embeddings.
ACM Trans. Web, 2022

Chandojnanam: A Sanskrit Meter Identification and Utilization System.
CoRR, 2022

Semantic Annotation and Querying Framework based on Semi-structured Ayurvedic Text.
CoRR, 2022

Predictions of Reynolds and Nusselt numbers in turbulent convection using machine-learning models.
CoRR, 2022

nigam@COLIEE-22: Legal Case Retrieval and Entailment Using Cascading of Lexical and Semantic-Based Models.
Proceedings of the New Frontiers in Artificial Intelligence, 2022

Comparison of In-Situ FOG Observations with Insat-3D Satellite FOG Product for North Indian Cities.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2022

Prediction of CardioVascular Disease (CVD) using Ensemble Learning Algorithms.
Proceedings of the CODS-COMAD 2022: 5th Joint International Conference on Data Science & Management of Data (9th ACM IKDD CODS and 27th COMAD), Bangalore, India, January 8, 2022

HLDC: Hindi Legal Documents Corpus.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Semantic Segmentation of Legal Documents via Rhetorical Roles.
Proceedings of the Natural Legal Language Processing Workshop, 2022

TIPS: Mining Top-K Locations to Minimize User-Inconvenience for Trajectory-Aware Services.
IEEE Trans. Knowl. Data Eng., 2021

ILDC for CJPE: Indian Legal Documents Corpus for Court Judgment Prediction and Explanation.
CoRR, 2021

Sangrahaka: a tool for annotating and querying knowledge graphs.
Proceedings of the ESEC/FSE '21: 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2021

GraphReach: Position-Aware Graph Neural Network using Reachability Estimations.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Overview of the Third Shared Task on Artificial Intelligence for Legal Assistance at FIRE 2021.
Proceedings of the Working Notes of FIRE 2021, 2021

AILA 2021: Shared task on Artificial Intelligence for Legal Assistance.
Proceedings of the FIRE 2021: Forum for Information Retrieval Evaluation, Virtual Event, India, December 13, 2021

Exploring State-of-the-Art Nearest Neighbor (NN) Search Techniques.
Proceedings of the CODS-COMAD 2021: 8th ACM IKDD CODS and 26th COMAD, 2021

Computing and Maintaining Provenance of Query Result Probabilities in Uncertain Knowledge Graphs.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

VerSaChI: Finding Statistically Significant Subgraph Matches using Chebyshev's Inequality.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

ILDC for CJPE: Indian Legal Documents Corpus for Court Judgment Prediction and Explanation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

ChiSeL: Graph Similarity Search using Chi-Squared Statistics in Large Probabilistic Graphs.
Proc. VLDB Endow., 2020

GraphReach: Locality-Aware Graph Neural Networks using Reachability Estimations.
CoRR, 2020

FIRE 2020 AILA Track: Artificial Intelligence for Legal Assistance.
Proceedings of the FIRE 2020: Forum for Information Retrieval Evaluation, 2020

Overview of the FIRE 2020 AILA Track: Artificial Intelligence for Legal Assistance.
Proceedings of the Working Notes of FIRE 2020, 2020

How and Why is An Answer (Still) Correct? Maintaining Provenance in Dynamic Knowledge Graphs.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

RAQ: Relationship-Aware Graph Querying in Large Networks.
Proceedings of the World Wide Web Conference, 2019

GRADES-NDA 2019: Joint International Workshop on Graph Data Management Experiences & Systems and Network Data Analytics.
Proceedings of the 2019 International Conference on Management of Data, 2019

Overview of the FIRE 2019 AILA Track: Artificial Intelligence for Legal Assistance.
Proceedings of the Working Notes of FIRE 2019, 2019

FIRE 2019 AILA Track: Artificial Intelligence for Legal Assistance.
Proceedings of the FIRE '19: Forum for Information Retrieval Evaluation, 2019

NetClus: A Scalable Framework to Mine Top-K Locations for Placement of Trajectory-Aware Services.
Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, 2019

Image Management for Biological Data.
Proceedings of the Encyclopedia of Database Systems, Second Edition, 2018

HD-Index: Pushing the Scalability-Accuracy Boundary for Approximate kNN Search in High-Dimensional Spaces.
Proc. VLDB Endow., 2018

Finding a largest rectangle inside a digital object and rectangularization.
J. Comput. Syst. Sci., 2018

Learning Multilingual Embeddings for Cross-Lingual Information Retrieval in the Presence of Topically Aligned Corpora.
CoRR, 2018

Identifying User Intent and Context in Graph Queries.
CoRR, 2018

Patterns for Indexing Large Datasets.
Proceedings of the 23rd European Conference on Pattern Languages of Programs, 2018

MineAr: using crowd knowledge for mining association rules in the health domain.
Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, 2018

Finding shell company accounts using anomaly detection.
Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, 2018

International Workshop on Legal Data Analytics and Mining (LeDAM 2018): Preface to the Proceedings.
Proceedings of the CIKM 2018 Workshops co-located with 27th ACM International Conference on Information and Knowledge Management (CIKM 2018), 2018

LDA Topic Modeling Based Dataset Dependency Matrix Prediction.
Proceedings of the Computational Intelligence, Communications, and Business Analytics, 2018

SkyGraph: Retrieving Regions of Interest using Skyline Subgraph Queries.
Proc. VLDB Endow., 2017

Neighbor-Aware Search for Approximate Labeled Graph Matching using the Chi-Square Statistics.
Proceedings of the 26th International Conference on World Wide Web, 2017

Automatic Grading and Feedback using Program Repair for Introductory Programming Courses.
Proceedings of the 2017 ACM Conference on Innovation and Technology in Computer Science Education, 2017

NetClus: A Scalable Framework for Locating Top-K Sites for Placement of Trajectory-Aware Services.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

K-Dominant Skyline Join Queries: Extending the Join Paradigm to K-Dominant Skylines.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

Overview of the FIRE 2017 IRLeD Track: Information Retrieval from Legal Documents.
Proceedings of the Working notes of FIRE 2017, 2017

Optimal Algorithms for Min-Closed, Max-Closed and Arc Consistency over Connected Row Convex Constraints.
Proceedings of the 10th Annual ACM India Compute Conference, 2017

Stopword Removal: Why Bother? A Case Study on Verbose Queries.
Proceedings of the 10th Annual ACM India Compute Conference, 2017

Tracking the Impact of Fact Deletions on Knowledge Graph Queries using Provenance Polynomials.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

GARUDA: A System for Large-Scale Mining of Statistically Significant Connected Subgraphs.
Proc. VLDB Endow., 2016

SMS: Stable Matching Algorithm using Skylines.
Proceedings of the 28th International Conference on Scientific and Statistical Database Management, 2016

Finding Largest Rectangle Inside a Digital Object.
Proceedings of the Computational Topology in Image Context - 6th International Workshop, 2016

SkyCover: Finding Range-Constrained Approximate Skylines with Bounded Quality Guarantees.
Proceedings of the 21st International Conference on Management of Data, 2016

Probabilistic aggregate skyline join queries: skylines with aggregate operations over existentially uncertain relations.
Proceedings of the 27th International Conference on Scientific and Statistical Database Management, 2015

Generation of Random Triangular Digital Curves Using Combinatorial Techniques.
Proceedings of the Pattern Recognition and Machine Intelligence, 2015

Trajectory aware macro-cell planning for mobile users.
Proceedings of the 2015 IEEE Conference on Computer Communications, 2015

Mining Wikileaks Data to Identify Sentiment Polarities in International Relationships.
Proceedings of the Database Systems for Advanced Applications, 2015

Using social connections to improve collaborative filtering.
Proceedings of the Second ACM IKDD Conference on Data Sciences, 2015

Generation of Random Digital Curves Using Combinatorial Techniques.
Proceedings of the Algorithms and Discrete Applied Mathematics, 2015

Mining statistically significant connected subgraphs in vertex labeled graphs.
Proceedings of the International Conference on Management of Data, 2014

Efficient and Effective Route Planning in Road Networks with Probabilistic Data using Skyline Paths.
Proceedings of the 1st IKDD Conference on Data Sciences, Delhi, India, March 21 - 23, 2014, 2014

Emotion Recognition from Audio and Visual Data using F-score based Fusion.
Proceedings of the 1st IKDD Conference on Data Sciences, Delhi, India, March 21 - 23, 2014, 2014

Constraint Satisfaction over Generalized Staircase Constraints.
CoRR, 2013

Evolution of the Modern Phase of Written Bangla: A Statistical Study.
Proceedings of the Mining Intelligence and Knowledge Exploration, 2013

Efficient edit distance based string similarity search using deletion neighborhoods.
Proceedings of the Joint 2013 EDBT/ICDT Conferences, 2013

RCached-tree: an index structure for efficiently answering popular queries.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

Mining Statistically Significant Substrings using the Chi-Square Statistic.
Proc. VLDB Endow., 2012

Hybrid HBase: Leveraging Flash SSDs to Improve Cost per Throughput of HBase.
Proceedings of the 18th International Conference on Management of Data, 2012

A Plant Identification System using Shape and Morphological Features on Segmented Leaflets: Team IITK, CLEF 2012.
Proceedings of the CLEF 2012 Evaluation Labs and Workshop, 2012

Finding the bias and prestige of nodes in networks based on trust scores.
Proceedings of the 20th International Conference on World Wide Web, 2011

A continuous query system for dynamic route planning.
Proceedings of the 27th International Conference on Data Engineering, 2011

Caching Stars in the Sky: A Semantic Caching Approach to Accelerate Skyline Queries.
Proceedings of the Database and Expert Systems Applications, 2011

Minimally Infrequent Itemset Mining using Pattern-Growth Paradigm and Residual Trees.
Proceedings of the 17th International Conference on Management of Data, 2011

Mining Statistically Significant Substrings Based on the Chi-Square Measure.
CoRR, 2010

Finding Top-k Similar Pairs of Objects Annotated with Terms from an Ontology.
Proceedings of the Scientific and Statistical Database Management, 2010

Most Significant Substring Mining Based on Chi-square Measure.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2010

Querying spatial patterns.
Proceedings of the EDBT 2010, 2010

Minimum Spanning Tree on Spatio-Temporal Networks.
Proceedings of the Database and Expert Systems Applications, 21st International Conference, 2010

INSTRUCT - Space-Efficient Structure for Indexing and Complete Query Management of String Databases.
Proceedings of the 16th International Conference on Management of Data, 2010

Aggregate Skyline Join Queries: Skylines with Aggregate Operations over Multiple Relations.
Proceedings of the 16th International Conference on Management of Data, 2010

Image Management for Biological Data.
Proceedings of the Encyclopedia of Database Systems, 2009

Finding Significant Subregions in Large Image Databases.
CoRR, 2009

On Low Distortion Embeddings of Statistical Distance Measures into Low Dimensional Spaces.
Proceedings of the Database and Expert Systems Applications, 20th International Conference, 2009

A general modeling and visualization tool for comparing different members of a group: application to studying tau-mediated regulation of microtubule dynamics.
BMC Bioinform., 2008

Efficient Computation of Statistical Significance of Query Results in Databases.
Proceedings of the Scientific and Statistical Database Management, 2008

MIST: Distributed Indexing and Querying in Sensor Networks using Statistical Models.
Proceedings of the 33rd International Conference on Very Large Data Bases, 2007

LB-Index: A Multi-Resolution Index Structure for Images.
Proceedings of the 22nd International Conference on Data Engineering, 2006

Indexing Spatially Sensitive Distance Measures Using Multi-resolution Lower Bounds.
Proceedings of the Advances in Database Technology, 2006

ViVo: Visual Vocabulary Construction for Mining Biomedical Images.
Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), 2005

Current Challenges in Bioimage Database Design.
Proceedings of the Fourth International IEEE Computer Society Computational Systems Bioinformatics Conference Workshops & Poster Abstracts, 2005

ProGreSS: Simultaneous Searching of Protein Databases by Sequence and Structure.
Proceedings of the Biocomputing 2004, 2004
