Marc González

According to our database1, Marc González authored at least 45 papers between 1997 and 2016.

Collaborative distances:

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Other 

Links

On csauthors.net:

Bibliography

2016
Coarse grain parallelization of deep neural networks.
Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016

2015
Coherence protocol for transparent management of scratchpad memories in shared memory manycore architectures.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

A Methodology to Build Models and Predict Performance-Power in CMPs.
Proceedings of the 44th International Conference on Parallel Processing Workshops, 2015

2014
User experience on heterogenous Liquid Galaxy cluster display walls.
Proceedings of the Proceeding of IEEE International Symposium on a World of Wireless, 2014


2013
A Systematic Methodology to Generate Decomposable and Responsive Power Models for CMPs.
IEEE Trans. Computers, 2013

Counter-Based Power Modeling Methods: Top-Down vs. Bottom-Up.
Comput. J., 2013

2012
DMA++: On the Fly Data Realignment for On-Chip Memories.
IEEE Trans. Computers, 2012

Energy accounting for shared virtualized environments under DVFS using PMC-based power models.
Future Generation Comp. Syst., 2012

POTRA: a framework for building power models for next generation multicore architectures.
Proceedings of the ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, 2012

Hardware-software coherence protocol for the coexistence of caches and local memories.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

Systematic Energy Characterization of CMP/SMT Processor Systems via Automated Micro-Benchmarks.
Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012

DMA-circular: an enhanced high level programmable DMA controller for optimized management of on-chip local memories.
Proceedings of the Computing Frontiers Conference, CF'12, 2012

2011
Local Memory Design Space Exploration for High-Performance Computing.
Comput. J., 2011

Design space exploration for aggressive core replication schemes in CMPs.
Proceedings of the 20th ACM International Symposium on High Performance Distributed Computing, 2011

2010
Automatic Prefetch and Modulo Scheduling Transformations for the Cell BE Architecture.
IEEE Trans. Parallel Distrib. Syst., 2010

Extending OpenMP to Survive the Heterogeneous Multi-Core Era.
International Journal of Parallel Programming, 2010

Optimizing the Exploitation of Multicore Processors and GPUs with OpenMP and OpenCL.
Proceedings of the Languages and Compilers for Parallel Computing, 2010

Decomposable and responsive power models for multicore processors using performance counters.
Proceedings of the 24th International Conference on Supercomputing, 2010

DMA++: on the fly data realignment for on-chip memories.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

Analysis of Task Offloading for Accelerators.
Proceedings of the High Performance Embedded Architectures and Compilers, 2010

Accurate energy accounting for shared virtualized environments using PMC-based power modeling techniques.
Proceedings of the 2010 11th IEEE/ACM International Conference on Grid Computing, 2010

2009
Adaptive and Speculative Memory Consistency Support for Multi-core Architectures with On-Chip Local Memories.
Proceedings of the Languages and Compilers for Parallel Computing, 2009

A Proposal to Extend the OpenMP Tasking Model for Heterogeneous Architectures.
Proceedings of the Evolving OpenMP in an Age of Extreme Parallelism, 2009

Speeding Up Distributed MapReduce Applications Using Hardware Accelerators.
Proceedings of the ICPP 2009, 2009

2008
Automatic Pre-Fetch and Modulo Scheduling Transformations for the Cell BE Architecture.
Proceedings of the Languages and Compilers for Parallel Computing, 2008

Prefetching irregular references for software cache on cell.
Proceedings of the Sixth International Symposium on Code Generation and Optimization (CGO 2008), 2008

Hybrid access-specific software cache techniques for the cell BE architecture.
Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, 2008

2007
A Novel Asynchronous Software Cache Implementation for the Cell-BE Processor.
Proceedings of the Languages and Compilers for Parallel Computing, 2007

2006
Runtime Address Space Computation for SDSM Systems.
Proceedings of the Languages and Compilers for Parallel Computing, 2006

A Proposal for Error Handling in OpenMP.
Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2006

Techniques supporting threadprivate in OpenMP.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

2005
Experiences Parallelizing a Web Server with OpenMP.
Proceedings of the OpenMP Shared Memory Parallel Programming - International Workshops, 2005

Automatic thread distribution for nested parallelism in OpenMP.
Proceedings of the 19th Annual International Conference on Supercomputing, 2005

2004
Employing Nested OpenMP for the Parallelization of Multi-Zone Computational Fluid Dynamics Applications.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

2003
Automatic multilevel parallelization using OpenMP.
Scientific Programming, 2003

2002
Dual-Level Parallelism Exploitation with OpenMP in Coastal Ocean Circulation Modeling.
Proceedings of the High Performance Computing, 4th International Symposium, 2002

2001
Defining and Supporting Pipelined Executions in OpenMP.
Proceedings of the OpenMP Shared Memory Parallel Programming, 2001

Complex Pipelined Executions in OpenMP Parallel Applications.
Proceedings of the 2001 International Conference on Parallel Processing, 2001

2000
NanosCompiler: supporting flexible multilevel parallelism exploitation in OpenMP.
Concurrency - Practice and Experience, 2000

OpenMP Extensions for Thread Groups and Their Run-Time Support.
Proceedings of the Languages and Compilers for Parallel Computing, 2000

Applying Interposition Techniques for Performance Analysis of OpenMP Parallel Applications.
Proceedings of the 14th International Parallel & Distributed Processing Symposium (IPDPS'00), 2000

1999
Thread fork/join techniques for multi-level parallelism exploitation in NUMA multiprocessors.
Proceedings of the 13th international conference on Supercomputing, 1999

Exploiting Multiple Levels of Parallelism in OpenMP: A Case Study.
Proceedings of the International Conference on Parallel Processing 1999, 1999

1997
Exploiting Parallelism Through Directives on the Nano-Threads Programming Model.
Proceedings of the Languages and Compilers for Parallel Computing, 1997


  Loading...