Tomás Lang

Javier Hormigo

IEEE Trans. Computers, 2012

Comments on 'improving the speed of decimal division'.

IET Comput. Digit. Tech., 2012

2009

Division Unit for Binary Integer Decimals.

Proceedings of the 20th IEEE International Conference on Application-Specific Systems, 2009

2007

A Radix-10 Digit-Recurrence Division Unit: Algorithm and Architecture.

IEEE Trans. Computers, 2007

Improving the Throughput of On-line Addition for Data Streams.

Javier Hormigo

Proceedings of the IEEE International Conference on Application-Specific Systems, 2007

2006

Double-Residue Modular Range Reduction for Floating-Point Hardware Implementations.

Mario A. González

IEEE Trans. Computers, 2006

2005

High-Throughput CORDIC-Based Geometry Operations for 3D Computer Graphics.

IEEE Trans. Computers, 2005

Digit-Recurrence Dividers with Reduced Logical Depth.

IEEE Trans. Computers, 2005

Floating-Point Fused Multiply-Add: Reduced Latency for Floating-Point Addition.

Proceedings of the 17th IEEE Symposium on Computer Arithmetic (ARITH-17 2005), 2005

Low Latency Digit-Recurrence Reciprocal and Square-Root Reciprocal Algorithm and Architecture.

Proceedings of the 17th IEEE Symposium on Computer Arithmetic (ARITH-17 2005), 2005

2004

Floating-Point Multiply-Add-Fused with Reduced Latency.

IEEE Trans. Computers, 2004

2003

Multilevel Reverse-Carry Addition: Single and Dual Adders.

J. VLSI Signal Process., 2003

Radix-4 Reciprocal Square-Root and Its Combination with Division and Square Root.

IEEE Trans. Computers, 2003

Comments on "A carry-free 54 b×54 b multiplier using equivalent bit conversion algorithm".

IEEE J. Solid State Circuits, 2003

2002

Floating-Point Fused Multiply-Add with Reduced Latency.

Proceedings of the 20th International Conference on Computer Design (ICCD 2002), 2002

Fast Radix-4 Retimed Division with Selection by Comparisons.

Proceedings of the 13th IEEE International Conference on Application-Specific Systems, 2002

2001

Multilevel reverse most-significant carry computation.

IEEE Trans. Very Large Scale Integr. Syst., 2001

Bounds on Runs of Zeros and Ones for Algebraic Functions.

Jean-Michel Muller

Proceedings of the 15th IEEE Symposium on Computer Arithmetic (Arith-15 2001), 2001

Correctly Rounded Reciprocal Square-Root by Digit Recurrence and Radix-4 Implementation.

Proceedings of the 15th IEEE Symposium on Computer Arithmetic (Arith-15 2001), 2001

Using the Reverse-Carry Approach for Double Datapath Floating-Point Addition.

Proceedings of the 15th IEEE Symposium on Computer Arithmetic (Arith-15 2001), 2001

2000

CORDIC-Based Computation of ArcCos.

J. VLSI Signal Process., 2000

Very-High Radix CORDIC Rotation Based on Selection by Rounding.

J. VLSI Signal Process., 2000

Reciprocation, Square Root, Inverse Square Root, and Some Elementary Functions Using Small Multipliers.

IEEE Trans. Computers, 2000

Very-High Radix Circular CORDIC: Vectoring and Unified Rotation/Vectoring.

IEEE Trans. Computers, 2000

Multilevel Reverse-Carry Adder.

Proceedings of the IEEE International Conference On Computer Design: VLSI In Computers & Processors, 2000

1999

Low-Power Divider.

IEEE Trans. Computers, 1999

Very High Radix Square Root with Prescaling and Rounding and a Combined Division/Square Root Unit.

IEEE Trans. Computers, 1999

Leading-One Prediction with Concurrent Position Correction.

IEEE Trans. Computers, 1999

Low-Power Radix-4 Combined Division and Square Root.

Proceedings of the IEEE International Conference On Computer Design, 1999

Multilevel Reverse-Carry Computation for Comparison and for Sign and Overflow Detection in Addition.

Proceedings of the IEEE International Conference On Computer Design, 1999

Low-Power Division: Comparison among Implementations of Radix 4, 8 and 16.

Proceedings of the 14th IEEE Symposium on Computer Arithmetic (Arith-14 '99), 1999

Boosting Very-High Radix Division with Prescaling and Selection by Rounding.

Proceedings of the 14th IEEE Symposium on Computer Arithmetic (Arith-14 '99), 1999

Very-High Radix CORDIC Vectoring with Scalings and Selection by Rounding.

Proceedings of the 14th IEEE Symposium on Computer Arithmetic (Arith-14 '99), 1999

1998

Parallel Compensation of Scale Factor for the CORDIC Algorithm.

Emilio L. Zapata

J. VLSI Signal Process., 1998

Working-zone encoding for reducing the energy in microprocessor address buses.

Enric Musoll

IEEE Trans. Very Large Scale Integr. Syst., 1998

Computation of sqrt(x/d) in a Very High Radix Combined Division/Square-Root Unit with Scaling.

IEEE Trans. Computers, 1998

Power-delay tradeoffs for radix-4 and radix-8 dividers.

Proceedings of the 1998 International Symposium on Low Power Electronics and Design, 1998

Low-power radix-8 divider.

Proceedings of the International Conference on Computer Design: VLSI in Computers and Processors, 1998

Extension of the working-zone-encoding method to reduce the energy on the microprocessor data bus.

Enric Musoll

Proceedings of the International Conference on Computer Design: VLSI in Computers and Processors, 1998

Leading-one prediction scheme for latency improvement in single datapath floating-point adders.

Proceedings of the International Conference on Computer Design: VLSI in Computers and Processors, 1998

1997

Error Analysis and Reduction for Angle Calculation Using the CORDIC Algorithm.

IEEE Trans. Computers, 1997

Exploiting the locality of memory references to reduce the address bus energy.

Enric Musoll

Proceedings of the 1997 International Symposium on Low Power Electronics and Design, 1997

Reducing TLB power requirements.

Toni Juan

Juan J. Navarro

Proceedings of the 1997 International Symposium on Low Power Electronics and Design, 1997

Low latency word serial CORDIC.

Proceedings of the 1997 International Conference on Application-Specific Systems, 1997

CORDIC-based computation of arccos and arcsin.

Proceedings of the 1997 International Conference on Application-Specific Systems, 1997

CORDIC Vectoring with Arbitrary Target Value.

Proceedings of the 13th Symposium on Computer Arithmetic (ARITH-13 '97), 1997

1996

On recoding in arithmetic algorithms.

J. VLSI Signal Process., 1996

Cordic based parallel/pipelined architecture for the Hough transform.

J. VLSI Signal Process., 1996

Low-power radix-4 divider.

Proceedings of the 1996 International Symposium on Low Power Electronics and Design, 1996

The Difference-bit Cache.

Toni Juan

Juan J. Navarro

Proceedings of the 23rd Annual International Symposium on Computer Architecture, 1996

High Radix Cordic Rotation Based on Selection by Rounding.

Proceedings of the Euro-Par '96 Parallel Processing, 1996

1995

Conflict-Free Access for Streams in Multimodule Memories.

IEEE Trans. Computers, 1995

Vector Multiprocessors with Arbitrated Memory Access.

Proceedings of the 22nd Annual International Symposium on Computer Architecture, 1995

2-D DCT using on-line arithmetic.

Proceedings of the 1995 International Conference on Acoustics, 1995

Very-high radix combined division and square root with prescaling and selection by rounding.

Proceedings of the 12th Symposium on Computer Arithmetic (ARITH-12 '95), 1995

Sign detection and comparison networks with a small number of transitions.

Proceedings of the 12th Symposium on Computer Arithmetic (ARITH-12 '95), 1995

1994

Very-High Radix Division with Prescaling and Selection by Rounding.

IEEE Trans. Computers, 1994

High-Radix Division and Square-Root with Speculation.

IEEE Trans. Computers, 1994

MOB forms: a class of multilevel block algorithms for dense linear algebra operations.

Juan J. Navarro

Toni Juan

Proceedings of the 8th international conference on Supercomputing, 1994

1993

Introduction.

J. VLSI Signal Process., 1993

Conflict-free access to streams in multiprocessor systems.

Microprocess. Microprogramming, 1993

Multiplication/division/square root module for massively parallel computers.

Integr., 1993

Very high radix division with selection by rounding and prescaling.

Proceedings of the 11th Symposium on Computer Arithmetic, 29 June, 1993

Division with speculation of quotient digits.

Proceedings of the 11th Symposium on Computer Arithmetic, 29 June, 1993

1992

A method for implementation of one-dimensional systolic algorithms with data contraflow using pipelined functional units.

J. VLSI Signal Process., 1992

Constant-Factor Redundant CORDIC for Angle Calculation and Rotation.

Jeong-A Lee

IEEE Trans. Computers, 1992

Higher Radix Square Root with Prescaling.

IEEE Trans. Computers, 1992

On-the-Fly Rounding.

IEEE Trans. Computers, 1992

Architectural Support for Goal Management in Flat Concurrent Prolog.

Leon Alkalaj

Computer, 1992

Increasing the Number of Strides for Conflict-Free Vector Access.

Proceedings of the 19th Annual International Symposium on Computer Architecture. Gold Coast, 1992

Conflict-free access of vectors with power-of-two strides.

Mateo Valero

Eduard Ayguadé

Proceedings of the 6th international conference on Supercomputing, 1992

MAMACG: a tool for automatic mapping of matrix algorithms onto mesh array computational graphs.

Proceedings of the Application Specific Array Processors, 1992

1991

Linear pseudosystolic array for partitioned matrix algorithms.

Miguel E. Figueroa

J. VLSI Signal Process., 1991

Architectural Support for Reduced Register Saving / Restoring in Single-Window Register Files.

ACM Trans. Comput. Syst., 1991

Conflict-Free Strides for Vectors in Matched Memories.

Parallel Process. Lett., 1991

Module to Perform Multiplication, Division, and Square Root in Systolic Arrays for Matrix Computations.

J. Parallel Distributed Comput., 1991

A Comparison of Redundant CORDIC Rotation Engines.

John A. Harding

Jeong-A Lee

Proceedings of the 1991 IEEE International Conference on Computer Design: VLSI in Computers & Processors, 1991

Mapping QR decomposition of a banded matrix on a 1D systolic array with data contraflow and pipelined functional units.

Proceedings of the Algorithms and Parallel VLSI Architectures II, 1991

SVD by constant-factor-redundant-CORDIC.

Jeong-A Lee

Proceedings of the 10th IEEE Symposium on Computer Arithmetic, 1991

1990

Fast Multiplication Without Carry-Propagate Addition.

IEEE Trans. Computers, 1990

Simple Radix-4 Division with Operands Scaling.

IEEE Trans. Computers, 1990

Redundant and On-Line CORDIC: Application to Matrix Triangularization and SVD.

IEEE Trans. Computers, 1990

Nonuniform Traffic Spots (NUTS) in Multistage Interconnection Networks.

J. Parallel Distributed Comput., 1990

Matrix Computations on Systolic-Type Meshes.

Computer, 1990

FCP Sequential Abstract Machine Characteristics for the Systems Development Workload.

Leon Alkalaj

Ehud Shapiro

Logic Programming, Proceedings of the 1990 North American Conference, Austin, Texas, USA, October 29, 1990

Architectural Support for the Management of Tightly-Coupled Fine-Grain Goals in Flat Concurrent Prolog.

Leon Alkalaj

Proceedings of the 17th Annual International Symposium on Computer Architecture, 1990

An Analytical Characterization of Generalized Shuffle-Exchange Networks.

Isaac D. Scherson

Peter F. Corbett

Proceedings of IEEE INFOCOM '90, 1990

The Performance of a Faulty Multistage Interconnection Network with Diverting Switches and Correction Links.

Proceedings of the 1990 International Conference on Parallel Processing, 1990

A graph-based approach to map matrix algorithms onto local-access processor arrays.

Proceedings of the Application Specific Array Processors, 1990

1989

Vector computations in an array computer.

PhD thesis, 1989

Comments on 'A systolic array for computing BA^{-1}'.

IEEE Trans. Acoust. Speech Signal Process., 1989

Multistage Networks Including Traffic with Real-Time Constraints.

Proceedings of the International Conference on Parallel Processing, 1989

On-the-fly rounding for division and square root.

Proceedings of the 9th Symposium on Computer Arithmetic, 1989

Radix-4 square root without initial PLA.

Proceedings of the 9th Symposium on Computer Arithmetic, 1989

1988

Graph-based Partitioning of Matrix Algorithms for Systolic Arrays: Application to Transitive Closure.

Proceedings of the International Conference on Parallel Processing, 1988

Nonuniform Traffic Spots in Multistage Interconnection Networks.

Proceedings of the International Conference on Parallel Processing, 1988

Implementation of fast radix-4 division with operands scaling.

Ramin Modiri

Proceedings of the Computer Design: VLSI in Computers and Processors, 1988

1987

On-the-Fly Conversion of Redundant into Conventional Representations.

IEEE Trans. Computers, 1987

A block-and-actions generator as an alternative to a simulator for collecting architecture measurements.

Yuval Tamir

Proceedings of the Symposium on Interpreters and Interpretive Techniques, 1987, St. Paul, Minnesota, USA, June 24, 1987

On-line scheme for computing rotation factors.

Proceedings of the 8th IEEE Symposium on Computer Arithmetic, 1987

1986

Reduced register saving/restoring in single-window register files.

SIGARCH Comput. Archit. News, 1986

Replication and Pipelining in Multiple-Instance Algorithms.

Proceedings of the International Conference on Parallel Processing, 1986

1985

A reduced register file for RISC architectures.

SIGARCH Comput. Archit. News, 1985

A division algorithm with prediction of quotient digits.

Proceedings of the 7th IEEE Symposium on Computer Arithmetic, 1985

1983

Reduction of Connections for Multibus Organization.

Mateo Valero

Miguel Angel Fiol

IEEE Trans. Computers, 1983

Minimization of Demand Paging for the LRU Stack Model of Program Behavior.

Christopher Wood

Inf. Process. Lett., 1983

A performance evaluation of the multiple bus network for multiprocessor systems.

Proceedings of the International Conference on Measurements and Modeling of Computer Systems, 1983

1982

Bandwidth of Crossbar and Multiple-Bus Connections for Multiprocessors.

Mateo Valero

Ignacio Alegre

IEEE Trans. Computers, 1982

1978

Architectural Support for System Protection and Database Security.

IEEE Trans. Computers, 1978

Effect of Replacement Algorithms on a Paged Buffer Database System.

Christopher Wood

IBM J. Res. Dev., 1978

1977

Database Buffer Paging in Virtual Storage Systems.

Christopher Wood

ACM Trans. Database Syst., 1977

Improving the Computation of Lower Bounds for Optimal Schedules.

IBM J. Res. Dev., 1977

A system architecture for compile-time actions in databases.

Rita C. Summers

Proceedings of the 1977 annual conference, 1977

1976

A Shuffle-Exchange Network with Simplified Control.

Harold S. Stone

IEEE Trans. Computers, 1976

Interconnections Between Processors and Memory Modules Using the Shuffle-Exchange Network.

IEEE Trans. Computers, 1976

Scheduling of Unit-Length Independent Tasks with Execution Constraints.

Inf. Process. Lett., 1976

Scheduling as a Graph Transformation.

IBM J. Res. Dev., 1976

1975

Computation of Lower Bounds for Multiprocessor Schedules.

IBM J. Res. Dev., 1975

Definition and Evaluation of Access Rules in Data Management Systems.

Rita C. Summers